From: Daniel Wagner <wagi(a)kernel.org>
[ Upstream commit d3d380eded7ee5fc2fc53b3b0e72365ded025c4a ]
The initial controller initialization mimiks the reconnect loop
behavior by switching from NEW to RESETTING and then to CONNECTING.
The transition from NEW to CONNECTING is a valid transition, so there is
no point entering the RESETTING state. TCP and RDMA also transition
directly to CONNECTING state.
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Daniel Wagner <wagi(a)kernel.org>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/nvme/host/fc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 3dbf926fd99fd..2b0f15de77111 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3525,8 +3525,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
spin_unlock_irqrestore(&rport->lock, flags);
- if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
- !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+ if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
dev_err(ctrl->ctrl.device,
"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
goto fail_ctrl;
--
2.39.5
From: Daniel Wagner <wagi(a)kernel.org>
[ Upstream commit d3d380eded7ee5fc2fc53b3b0e72365ded025c4a ]
The initial controller initialization mimiks the reconnect loop
behavior by switching from NEW to RESETTING and then to CONNECTING.
The transition from NEW to CONNECTING is a valid transition, so there is
no point entering the RESETTING state. TCP and RDMA also transition
directly to CONNECTING state.
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Daniel Wagner <wagi(a)kernel.org>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/nvme/host/fc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index cdb1e706f855e..bb85e79b62fad 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3550,8 +3550,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
spin_unlock_irqrestore(&rport->lock, flags);
- if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
- !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+ if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
dev_err(ctrl->ctrl.device,
"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
goto fail_ctrl;
--
2.39.5
From: Daniel Wagner <wagi(a)kernel.org>
[ Upstream commit d3d380eded7ee5fc2fc53b3b0e72365ded025c4a ]
The initial controller initialization mimiks the reconnect loop
behavior by switching from NEW to RESETTING and then to CONNECTING.
The transition from NEW to CONNECTING is a valid transition, so there is
no point entering the RESETTING state. TCP and RDMA also transition
directly to CONNECTING state.
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Daniel Wagner <wagi(a)kernel.org>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/nvme/host/fc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index b81af7919e94c..d45ab530ff9b7 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3579,8 +3579,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
spin_unlock_irqrestore(&rport->lock, flags);
- if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
- !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+ if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
dev_err(ctrl->ctrl.device,
"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
goto fail_ctrl;
--
2.39.5
From: Daniel Wagner <wagi(a)kernel.org>
[ Upstream commit d3d380eded7ee5fc2fc53b3b0e72365ded025c4a ]
The initial controller initialization mimiks the reconnect loop
behavior by switching from NEW to RESETTING and then to CONNECTING.
The transition from NEW to CONNECTING is a valid transition, so there is
no point entering the RESETTING state. TCP and RDMA also transition
directly to CONNECTING state.
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Reviewed-by: Hannes Reinecke <hare(a)suse.de>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Daniel Wagner <wagi(a)kernel.org>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/nvme/host/fc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index b81af7919e94c..d45ab530ff9b7 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3579,8 +3579,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
list_add_tail(&ctrl->ctrl_list, &rport->ctrl_list);
spin_unlock_irqrestore(&rport->lock, flags);
- if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING) ||
- !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
+ if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
dev_err(ctrl->ctrl.device,
"NVME-FC{%d}: failed to init ctrl state\n", ctrl->cnum);
goto fail_ctrl;
--
2.39.5
The patch titled
Subject: mm/rmap: reject hugetlb folios in folio_make_device_exclusive()
has been added to the -mm mm-unstable branch. Its filename is
mm-rmap-reject-hugetlb-folios-in-folio_make_device_exclusive.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/rmap: reject hugetlb folios in folio_make_device_exclusive()
Date: Mon, 10 Feb 2025 20:37:44 +0100
Even though FOLL_SPLIT_PMD on hugetlb now always fails with -EOPNOTSUPP,
let's add a safety net in case FOLL_SPLIT_PMD usage would ever be
reworked.
In particular, before commit 9cb28da54643 ("mm/gup: handle hugetlb in the
generic follow_page_mask code"), GUP(FOLL_SPLIT_PMD) would just have
returned a page. In particular, hugetlb folios that are not PMD-sized
would never have been prone to FOLL_SPLIT_PMD.
hugetlb folios can be anonymous, and page_make_device_exclusive_one() is
not really prepared for handling them at all. So let's spell that out.
Link: https://lkml.kernel.org/r/20250210193801.781278-3-david@redhat.com
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Alistair Popple <apopple(a)nvidia.com>
Cc: Alex Shi <alexs(a)kernel.org>
Cc: Danilo Krummrich <dakr(a)kernel.org>
Cc: Dave Airlie <airlied(a)gmail.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Lyude <lyude(a)redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat(a)kernel.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Simona Vetter <simona.vetter(a)ffwll.ch>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yanteng Si <si.yanteng(a)linux.dev>
Cc: Barry Song <v-songbaohua(a)oppo.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/rmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/rmap.c~mm-rmap-reject-hugetlb-folios-in-folio_make_device_exclusive
+++ a/mm/rmap.c
@@ -2499,7 +2499,7 @@ static bool folio_make_device_exclusive(
* Restrict to anonymous folios for now to avoid potential writeback
* issues.
*/
- if (!folio_test_anon(folio))
+ if (!folio_test_anon(folio) || folio_test_hugetlb(folio))
return false;
rmap_walk(folio, &rwc);
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-gup-reject-foll_split_pmd-with-hugetlb-vmas.patch
mm-rmap-reject-hugetlb-folios-in-folio_make_device_exclusive.patch
mm-rmap-convert-make_device_exclusive_range-to-make_device_exclusive.patch
mm-rmap-implement-make_device_exclusive-using-folio_walk-instead-of-rmap-walk.patch
mm-memory-detect-writability-in-restore_exclusive_pte-through-can_change_pte_writable.patch
mm-use-single-swp_device_exclusive-entry-type.patch
mm-page_vma_mapped-device-exclusive-entries-are-not-migration-entries.patch
kernel-events-uprobes-handle-device-exclusive-entries-correctly-in-__replace_page.patch
mm-ksm-handle-device-exclusive-entries-correctly-in-write_protect_page.patch
mm-rmap-handle-device-exclusive-entries-correctly-in-try_to_unmap_one.patch
mm-rmap-handle-device-exclusive-entries-correctly-in-try_to_migrate_one.patch
mm-rmap-handle-device-exclusive-entries-correctly-in-page_vma_mkclean_one.patch
mm-page_idle-handle-device-exclusive-entries-correctly-in-page_idle_clear_pte_refs_one.patch
mm-damon-handle-device-exclusive-entries-correctly-in-damon_folio_young_one.patch
mm-damon-handle-device-exclusive-entries-correctly-in-damon_folio_mkold_one.patch
mm-rmap-keep-mapcount-untouched-for-device-exclusive-entries.patch
mm-rmap-avoid-ebusy-from-make_device_exclusive.patch
The patch titled
Subject: mm/gup: reject FOLL_SPLIT_PMD with hugetlb VMAs
has been added to the -mm mm-unstable branch. Its filename is
mm-gup-reject-foll_split_pmd-with-hugetlb-vmas.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/gup: reject FOLL_SPLIT_PMD with hugetlb VMAs
Date: Mon, 10 Feb 2025 20:37:43 +0100
Patch series "mm: fixes for device-exclusive entries (hmm)", v2.
Discussing the PageTail() call in make_device_exclusive_range() with
Willy, I recently discovered [1] that device-exclusive handling does not
properly work with THP, making the hmm-tests selftests fail if THPs are
enabled on the system.
Looking into more details, I found that hugetlb is not properly fenced,
and I realized that something that was bugging me for longer -- how
device-exclusive entries interact with mapcounts -- completely breaks
migration/swapout/split/hwpoison handling of these folios while they have
device-exclusive PTEs.
The program below can be used to allocate 1 GiB worth of pages and making
them device-exclusive on a kernel with CONFIG_TEST_HMM.
Once they are device-exclusive, these folios cannot get swapped out
(proc$pid/smaps_rollup will always indicate 1 GiB RSS no matter how much
one forces memory reclaim), and when having a memory block onlined to
ZONE_MOVABLE, trying to offline it will loop forever and complain about
failed migration of a page that should be movable.
# echo offline > /sys/devices/system/memory/memory136/state
# echo online_movable > /sys/devices/system/memory/memory136/state
# ./hmm-swap &
... wait until everything is device-exclusive
# echo offline > /sys/devices/system/memory/memory136/state
[ 285.193431][T14882] page: refcount:2 mapcount:0 mapping:0000000000000000
index:0x7f20671f7 pfn:0x442b6a
[ 285.196618][T14882] memcg:ffff888179298000
[ 285.198085][T14882] anon flags: 0x5fff0000002091c(referenced|uptodate|
dirty|active|owner_2|swapbacked|node=1|zone=3|lastcpupid=0x7ff)
[ 285.201734][T14882] raw: ...
[ 285.204464][T14882] raw: ...
[ 285.207196][T14882] page dumped because: migration failure
[ 285.209072][T14882] page_owner tracks the page as allocated
[ 285.210915][T14882] page last allocated via order 0, migratetype
Movable, gfp_mask 0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO),
id 14926, tgid 14926 (hmm-swap), ts 254506295376, free_ts 227402023774
[ 285.216765][T14882] post_alloc_hook+0x197/0x1b0
[ 285.218874][T14882] get_page_from_freelist+0x76e/0x3280
[ 285.220864][T14882] __alloc_frozen_pages_noprof+0x38e/0x2740
[ 285.223302][T14882] alloc_pages_mpol+0x1fc/0x540
[ 285.225130][T14882] folio_alloc_mpol_noprof+0x36/0x340
[ 285.227222][T14882] vma_alloc_folio_noprof+0xee/0x1a0
[ 285.229074][T14882] __handle_mm_fault+0x2b38/0x56a0
[ 285.230822][T14882] handle_mm_fault+0x368/0x9f0
...
This series fixes all issues I found so far. There is no easy way to fix
without a bigger rework/cleanup. I have a bunch of cleanups on top (some
previous sent, some the result of the discussion in v1) that I will send
out separately once this landed and I get to it.
I wish we could just use some special present PROT_NONE PTEs instead of
these (non-present, non-none) fake-swap entries; but that just results in
the same problem we keep having (lack of spare PTE bits), and staring at
other similar fake-swap entries, that ship has sailed.
With this series, make_device_exclusive() doesn't actually belong into
mm/rmap.c anymore, but I'll leave moving that for another day.
I only tested this series with the hmm-tests selftests due to lack of HW,
so I'd appreciate some testing, especially if the interaction between two
GPUs wanting a device-exclusive entry works as expected.
<program>
#include <stdio.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/ioctl.h>
#define HMM_DMIRROR_EXCLUSIVE _IOWR('H', 0x05, struct hmm_dmirror_cmd)
struct hmm_dmirror_cmd {
__u64 addr;
__u64 ptr;
__u64 npages;
__u64 cpages;
__u64 faults;
};
const size_t size = 1 * 1024 * 1024 * 1024ul;
const size_t chunk_size = 2 * 1024 * 1024ul;
int main(void)
{
struct hmm_dmirror_cmd cmd;
size_t cur_size;
int fd, ret;
char *addr, *mirror;
fd = open("/dev/hmm_dmirror1", O_RDWR, 0);
if (fd < 0) {
perror("open failed\n");
exit(1);
}
addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (addr == MAP_FAILED) {
perror("mmap failed\n");
exit(1);
}
madvise(addr, size, MADV_NOHUGEPAGE);
memset(addr, 1, size);
mirror = malloc(chunk_size);
for (cur_size = 0; cur_size < size; cur_size += chunk_size) {
cmd.addr = (uintptr_t)addr + cur_size;
cmd.ptr = (uintptr_t)mirror;
cmd.npages = chunk_size / getpagesize();
ret = ioctl(fd, HMM_DMIRROR_EXCLUSIVE, &cmd);
if (ret) {
perror("ioctl failed\n");
exit(1);
}
}
pause();
return 0;
}
</program>
[1] https://lkml.kernel.org/r/25e02685-4f1d-47fa-be5b-01ff85bb0ce2@redhat.com
This patch (of 17):
We only have two FOLL_SPLIT_PMD users. While uprobe refuses hugetlb
early, make_device_exclusive_range() can end up getting called on hugetlb
VMAs.
Right now, this means that with a PMD-sized hugetlb page, we can end up
calling split_huge_pmd(), because pmd_trans_huge() also succeeds with
hugetlb PMDs.
For example, using a modified hmm-test selftest one can trigger:
[ 207.017134][T14945] ------------[ cut here ]------------
[ 207.018614][T14945] kernel BUG at mm/page_table_check.c:87!
[ 207.019716][T14945] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 207.021072][T14945] CPU: 3 UID: 0 PID: ...
[ 207.023036][T14945] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
[ 207.024834][T14945] RIP: 0010:page_table_check_clear.part.0+0x488/0x510
[ 207.026128][T14945] Code: ...
[ 207.029965][T14945] RSP: 0018:ffffc9000cb8f348 EFLAGS: 00010293
[ 207.031139][T14945] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: ffffffff8249a0cd
[ 207.032649][T14945] RDX: ffff88811e883c80 RSI: ffffffff8249a357 RDI: ffff88811e883c80
[ 207.034183][T14945] RBP: ffff888105c0a050 R08: 0000000000000005 R09: 0000000000000000
[ 207.035688][T14945] R10: 00000000ffffffff R11: 0000000000000003 R12: 0000000000000001
[ 207.037203][T14945] R13: 0000000000000200 R14: 0000000000000001 R15: dffffc0000000000
[ 207.038711][T14945] FS: 00007f2783275740(0000) GS:ffff8881f4980000(0000) knlGS:0000000000000000
[ 207.040407][T14945] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 207.041660][T14945] CR2: 00007f2782c00000 CR3: 0000000132356000 CR4: 0000000000750ef0
[ 207.043196][T14945] PKRU: 55555554
[ 207.043880][T14945] Call Trace:
[ 207.044506][T14945] <TASK>
[ 207.045086][T14945] ? __die+0x51/0x92
[ 207.045864][T14945] ? die+0x29/0x50
[ 207.046596][T14945] ? do_trap+0x250/0x320
[ 207.047430][T14945] ? do_error_trap+0xe7/0x220
[ 207.048346][T14945] ? page_table_check_clear.part.0+0x488/0x510
[ 207.049535][T14945] ? handle_invalid_op+0x34/0x40
[ 207.050494][T14945] ? page_table_check_clear.part.0+0x488/0x510
[ 207.051681][T14945] ? exc_invalid_op+0x2e/0x50
[ 207.052589][T14945] ? asm_exc_invalid_op+0x1a/0x20
[ 207.053596][T14945] ? page_table_check_clear.part.0+0x1fd/0x510
[ 207.054790][T14945] ? page_table_check_clear.part.0+0x487/0x510
[ 207.055993][T14945] ? page_table_check_clear.part.0+0x488/0x510
[ 207.057195][T14945] ? page_table_check_clear.part.0+0x487/0x510
[ 207.058384][T14945] __page_table_check_pmd_clear+0x34b/0x5a0
[ 207.059524][T14945] ? __pfx___page_table_check_pmd_clear+0x10/0x10
[ 207.060775][T14945] ? __pfx___mutex_unlock_slowpath+0x10/0x10
[ 207.061940][T14945] ? __pfx___lock_acquire+0x10/0x10
[ 207.062967][T14945] pmdp_huge_clear_flush+0x279/0x360
[ 207.064024][T14945] split_huge_pmd_locked+0x82b/0x3750
...
Before commit 9cb28da54643 ("mm/gup: handle hugetlb in the generic
follow_page_mask code"), we would have ignored the flag; instead, let's
simply refuse the combination completely in check_vma_flags(): the caller
is likely not prepared to handle any hugetlb folios.
We'll teach make_device_exclusive_range() separately to ignore any hugetlb
folios as a future-proof safety net.
Link: https://lkml.kernel.org/r/20250210193801.781278-1-david@redhat.com
Link: https://lkml.kernel.org/r/20250210193801.781278-2-david@redhat.com
Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Alistair Popple <apopple(a)nvidia.com>
Cc: Alex Shi <alexs(a)kernel.org>
Cc: Danilo Krummrich <dakr(a)kernel.org>
Cc: Dave Airlie <airlied(a)gmail.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Lyude <lyude(a)redhat.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat(a)kernel.org>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: SeongJae Park <sj(a)kernel.org>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Yanteng Si <si.yanteng(a)linux.dev>
Cc: Simona Vetter <simona.vetter(a)ffwll.ch>
Cc: Barry Song <v-songbaohua(a)oppo.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/gup.c~mm-gup-reject-foll_split_pmd-with-hugetlb-vmas
+++ a/mm/gup.c
@@ -1283,6 +1283,9 @@ static int check_vma_flags(struct vm_are
if ((gup_flags & FOLL_LONGTERM) && vma_is_fsdax(vma))
return -EOPNOTSUPP;
+ if ((gup_flags & FOLL_SPLIT_PMD) && is_vm_hugetlb_page(vma))
+ return -EOPNOTSUPP;
+
if (vma_is_secretmem(vma))
return -EFAULT;
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-gup-reject-foll_split_pmd-with-hugetlb-vmas.patch
mm-rmap-reject-hugetlb-folios-in-folio_make_device_exclusive.patch
mm-rmap-convert-make_device_exclusive_range-to-make_device_exclusive.patch
mm-rmap-implement-make_device_exclusive-using-folio_walk-instead-of-rmap-walk.patch
mm-memory-detect-writability-in-restore_exclusive_pte-through-can_change_pte_writable.patch
mm-use-single-swp_device_exclusive-entry-type.patch
mm-page_vma_mapped-device-exclusive-entries-are-not-migration-entries.patch
kernel-events-uprobes-handle-device-exclusive-entries-correctly-in-__replace_page.patch
mm-ksm-handle-device-exclusive-entries-correctly-in-write_protect_page.patch
mm-rmap-handle-device-exclusive-entries-correctly-in-try_to_unmap_one.patch
mm-rmap-handle-device-exclusive-entries-correctly-in-try_to_migrate_one.patch
mm-rmap-handle-device-exclusive-entries-correctly-in-page_vma_mkclean_one.patch
mm-page_idle-handle-device-exclusive-entries-correctly-in-page_idle_clear_pte_refs_one.patch
mm-damon-handle-device-exclusive-entries-correctly-in-damon_folio_young_one.patch
mm-damon-handle-device-exclusive-entries-correctly-in-damon_folio_mkold_one.patch
mm-rmap-keep-mapcount-untouched-for-device-exclusive-entries.patch
mm-rmap-avoid-ebusy-from-make_device_exclusive.patch
Commit 91b587ba79e1 ("f2fs: Introduce linear search for dentries")
This is a follow up to a patch that previously merged in stable to restore
access to files created with emojis under a different hashing scheme.
I believe it's relevant for all currently supported stable branches.
-Daniel
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 604120fd9e9df50ee0e803d3c6e77a1f45d2c58e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025021013-rearrange-cavalry-69c3@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 604120fd9e9df50ee0e803d3c6e77a1f45d2c58e Mon Sep 17 00:00:00 2001
From: Sumit Gupta <sumitg(a)nvidia.com>
Date: Wed, 18 Dec 2024 00:07:36 +0000
Subject: [PATCH] arm64: tegra: Fix typo in Tegra234 dce-fabric compatible
The compatible string for the Tegra DCE fabric is currently defined as
'nvidia,tegra234-sce-fabric' but this is incorrect because this is the
compatible string for SCE fabric. Update the compatible for the DCE
fabric to correct the compatible string.
This compatible needs to be correct in order for the interconnect
to catch things such as improper data accesses.
Cc: stable(a)vger.kernel.org
Fixes: 302e154000ec ("arm64: tegra: Add node for CBB 2.0 on Tegra234")
Signed-off-by: Sumit Gupta <sumitg(a)nvidia.com>
Signed-off-by: Ivy Huang <yijuh(a)nvidia.com>
Reviewed-by: Brad Griffis <bgriffis(a)nvidia.com>
Reviewed-by: Jon Hunter <jonathanh(a)nvidia.com>
Link: https://lore.kernel.org/r/20241218000737.1789569-2-yijuh@nvidia.com
Signed-off-by: Thierry Reding <treding(a)nvidia.com>
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index 570331baa09e..62b9f1784030 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -3995,7 +3995,7 @@ bpmp-fabric@d600000 {
};
dce-fabric@de00000 {
- compatible = "nvidia,tegra234-sce-fabric";
+ compatible = "nvidia,tegra234-dce-fabric";
reg = <0x0 0xde00000 0x0 0x40000>;
interrupts = <GIC_SPI 381 IRQ_TYPE_LEVEL_HIGH>;
status = "okay";
Even though FOLL_SPLIT_PMD on hugetlb now always fails with -EOPNOTSUPP,
let's add a safety net in case FOLL_SPLIT_PMD usage would ever be reworked.
In particular, before commit 9cb28da54643 ("mm/gup: handle hugetlb in the
generic follow_page_mask code"), GUP(FOLL_SPLIT_PMD) would just have
returned a page. In particular, hugetlb folios that are not PMD-sized
would never have been prone to FOLL_SPLIT_PMD.
hugetlb folios can be anonymous, and page_make_device_exclusive_one() is
not really prepared for handling them at all. So let's spell that out.
Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Reviewed-by: Alistair Popple <apopple(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
mm/rmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index c6c4d4ea29a7e..17fbfa61f7efb 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2499,7 +2499,7 @@ static bool folio_make_device_exclusive(struct folio *folio,
* Restrict to anonymous folios for now to avoid potential writeback
* issues.
*/
- if (!folio_test_anon(folio))
+ if (!folio_test_anon(folio) || folio_test_hugetlb(folio))
return false;
rmap_walk(folio, &rwc);
--
2.48.1