The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y git checkout FETCH_HEAD git cherry-pick -x f183663901f21fe0fba8bd31ae894bc529709ee0 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to 'stable@vger.kernel.org' --in-reply-to '2026010547-early-outnumber-a575@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f183663901f21fe0fba8bd31ae894bc529709ee0 Mon Sep 17 00:00:00 2001 From: Bijan Tabatabai bijan311@gmail.com Date: Tue, 16 Dec 2025 14:07:27 -0600 Subject: [PATCH] mm: consider non-anon swap cache folios in folio_expected_ref_count()
Currently, folio_expected_ref_count() only adds references for the swap cache if the folio is anonymous. However, according to the comment above the definition of PG_swapcache in enum pageflags, shmem folios can also have PG_swapcache set. This patch makes sure references for the swap cache are added if folio_test_swapcache(folio) is true.
This issue was found when trying to hot-unplug memory in a QEMU/KVM virtual machine. When initiating hot-unplug when most of the guest memory is allocated, hot-unplug hangs partway through removal due to migration failures. The following message would be printed several times, and would be printed again about every five seconds:
[ 49.641309] migrating pfn b12f25 failed ret:7 [ 49.641310] page: refcount:2 mapcount:0 mapping:0000000033bd8fe2 index:0x7f404d925 pfn:0xb12f25 [ 49.641311] aops:swap_aops [ 49.641313] flags: 0x300000000030508(uptodate|active|owner_priv_1|reclaim|swapbacked|node=0|zone=3) [ 49.641314] raw: 0300000000030508 ffffed312c4bc908 ffffed312c4bc9c8 0000000000000000 [ 49.641315] raw: 00000007f404d925 00000000000c823b 00000002ffffffff 0000000000000000 [ 49.641315] page dumped because: migration failure
When debugging this, I found that these migration failures were due to __migrate_folio() returning -EAGAIN for a small set of folios because the expected reference count it calculates via folio_expected_ref_count() is one less than the actual reference count of the folios. Furthermore, all of the affected folios were not anonymous, but had the PG_swapcache flag set, inspiring this patch. After applying this patch, the memory hot-unplug behaves as expected.
I tested this on a machine running Ubuntu 24.04 with kernel version 6.8.0-90-generic and 64GB of memory. The guest VM is managed by libvirt and runs Ubuntu 24.04 with kernel version 6.18 (though the head of the mm-unstable branch as a Dec 16, 2025 was also tested and behaves the same) and 48GB of memory. The libvirt XML definition for the VM can be found at [1]. CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE is set in the guest kernel so the hot-pluggable memory is automatically onlined.
Below are the steps to reproduce this behavior:
1) Define and start and virtual machine host$ virsh -c qemu:///system define ./test_vm.xml # test_vm.xml from [1] host$ virsh -c qemu:///system start test_vm
2) Setup swap in the guest guest$ sudo fallocate -l 32G /swapfile guest$ sudo chmod 0600 /swapfile guest$ sudo mkswap /swapfile guest$ sudo swapon /swapfile
3) Use alloc_data [2] to allocate most of the remaining guest memory guest$ ./alloc_data 45
4) In a separate guest terminal, monitor the amount of used memory guest$ watch -n1 free -h
5) When alloc_data has finished allocating, initiate the memory hot-unplug using the provided xml file [3] host$ virsh -c qemu:///system detach-device test_vm ./remove.xml --live
After initiating the memory hot-unplug, you should see the amount of available memory in the guest decrease, and the amount of used swap data increase. If everything works as expected, when all of the memory is unplugged, there should be around 8.5-9GB of data in swap. If the unplugging is unsuccessful, the amount of used swap data will settle below that. If that happens, you should be able to see log messages in dmesg similar to the one posted above.
Link: https://lkml.kernel.org/r/20251216200727.2360228-1-bijan311@gmail.com Link: https://github.com/BijanT/linux_patch_files/blob/main/test_vm.xml [1] Link: https://github.com/BijanT/linux_patch_files/blob/main/alloc_data.c [2] Link: https://github.com/BijanT/linux_patch_files/blob/main/remove.xml [3] Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation") Signed-off-by: Bijan Tabatabai bijan311@gmail.com Acked-by: David Hildenbrand (Red Hat) david@kernel.org Acked-by: Zi Yan ziy@nvidia.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Cc: Liam Howlett liam.howlett@oracle.com Cc: Lorenzo Stoakes lorenzo.stoakes@oracle.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Rapoport rppt@kernel.org Cc: Shivank Garg shivankg@amd.com Cc: Suren Baghdasaryan surenb@google.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Kairui Song ryncsn@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org
diff --git a/include/linux/mm.h b/include/linux/mm.h index 15076261d0c2..6f959d8ca4b4 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2459,10 +2459,10 @@ static inline int folio_expected_ref_count(const struct folio *folio) if (WARN_ON_ONCE(page_has_type(&folio->page) && !folio_test_hugetlb(folio))) return 0;
- if (folio_test_anon(folio)) { - /* One reference per page from the swapcache. */ - ref_count += folio_test_swapcache(folio) << order; - } else { + /* One reference per page from the swapcache. */ + ref_count += folio_test_swapcache(folio) << order; + + if (!folio_test_anon(folio)) { /* One reference per page from the pagecache. */ ref_count += !!folio->mapping << order; /* One reference from PG_private. */
From: David Hildenbrand david@redhat.com
[ Upstream commit 78cb1a13c42a6d843e21389f74d1edb90ed07288 ]
Now that PAGE_MAPPING_MOVABLE is gone, we can simplify and rely on the folio_test_anon() test only.
... but staring at the users, this function should never even have been called on movable_ops pages. E.g., * __buffer_migrate_folio() does not make sense for them * folio_migrate_mapping() does not make sense for them * migrate_huge_page_move_mapping() does not make sense for them * __migrate_folio() does not make sense for them * ... and khugepaged should never stumble over them
Let's simply refuse typed pages (which includes slab) except hugetlb, and WARN.
Link: https://lkml.kernel.org/r/20250704102524.326966-26-david@redhat.com Signed-off-by: David Hildenbrand david@redhat.com Reviewed-by: Zi Yan ziy@nvidia.com Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com Reviewed-by: Harry Yoo harry.yoo@oracle.com Cc: Alistair Popple apopple@nvidia.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Arnd Bergmann arnd@arndb.de Cc: Brendan Jackman jackmanb@google.com Cc: Byungchul Park byungchul@sk.com Cc: Chengming Zhou chengming.zhou@linux.dev Cc: Christian Brauner brauner@kernel.org Cc: Christophe Leroy christophe.leroy@csgroup.eu Cc: Eugenio Pé rez eperezma@redhat.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Gregory Price gourry@gourry.net Cc: "Huang, Ying" ying.huang@linux.alibaba.com Cc: Jan Kara jack@suse.cz Cc: Jason Gunthorpe jgg@ziepe.ca Cc: Jason Wang jasowang@redhat.com Cc: Jerrin Shaji George jerrin.shaji-george@broadcom.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: John Hubbard jhubbard@nvidia.com Cc: Jonathan Corbet corbet@lwn.net Cc: Joshua Hahn joshua.hahnjy@gmail.com Cc: Liam Howlett liam.howlett@oracle.com Cc: Madhavan Srinivasan maddy@linux.ibm.com Cc: Mathew Brost matthew.brost@intel.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Miaohe Lin linmiaohe@huawei.com Cc: Michael Ellerman mpe@ellerman.id.au Cc: "Michael S. Tsirkin" mst@redhat.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Rapoport rppt@kernel.org Cc: Minchan Kim minchan@kernel.org Cc: Naoya Horiguchi nao.horiguchi@gmail.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Oscar Salvador osalvador@suse.de Cc: Peter Xu peterx@redhat.com Cc: Qi Zheng zhengqi.arch@bytedance.com Cc: Rakie Kim rakie.kim@sk.com Cc: Rik van Riel riel@surriel.com Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: Shakeel Butt shakeel.butt@linux.dev Cc: Suren Baghdasaryan surenb@google.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Xuan Zhuo xuanzhuo@linux.alibaba.com Cc: xu xin xu.xin16@zte.com.cn Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: f183663901f2 ("mm: consider non-anon swap cache folios in folio_expected_ref_count()") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h index 44381ffaf34b..61efff50eb67 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1820,13 +1820,13 @@ static inline int folio_expected_ref_count(struct folio *folio) const int order = folio_order(folio); int ref_count = 0;
- if (WARN_ON_ONCE(folio_test_slab(folio))) + if (WARN_ON_ONCE(page_has_type(&folio->page) && !folio_test_hugetlb(folio))) return 0;
if (folio_test_anon(folio)) { /* One reference per page from the swapcache. */ ref_count += folio_test_swapcache(folio) << order; - } else if (!((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS)) { + } else { /* One reference per page from the pagecache. */ ref_count += !!folio->mapping << order; /* One reference from PG_private. */
From: Bijan Tabatabai bijan311@gmail.com
[ Upstream commit f183663901f21fe0fba8bd31ae894bc529709ee0 ]
Currently, folio_expected_ref_count() only adds references for the swap cache if the folio is anonymous. However, according to the comment above the definition of PG_swapcache in enum pageflags, shmem folios can also have PG_swapcache set. This patch makes sure references for the swap cache are added if folio_test_swapcache(folio) is true.
This issue was found when trying to hot-unplug memory in a QEMU/KVM virtual machine. When initiating hot-unplug when most of the guest memory is allocated, hot-unplug hangs partway through removal due to migration failures. The following message would be printed several times, and would be printed again about every five seconds:
[ 49.641309] migrating pfn b12f25 failed ret:7 [ 49.641310] page: refcount:2 mapcount:0 mapping:0000000033bd8fe2 index:0x7f404d925 pfn:0xb12f25 [ 49.641311] aops:swap_aops [ 49.641313] flags: 0x300000000030508(uptodate|active|owner_priv_1|reclaim|swapbacked|node=0|zone=3) [ 49.641314] raw: 0300000000030508 ffffed312c4bc908 ffffed312c4bc9c8 0000000000000000 [ 49.641315] raw: 00000007f404d925 00000000000c823b 00000002ffffffff 0000000000000000 [ 49.641315] page dumped because: migration failure
When debugging this, I found that these migration failures were due to __migrate_folio() returning -EAGAIN for a small set of folios because the expected reference count it calculates via folio_expected_ref_count() is one less than the actual reference count of the folios. Furthermore, all of the affected folios were not anonymous, but had the PG_swapcache flag set, inspiring this patch. After applying this patch, the memory hot-unplug behaves as expected.
I tested this on a machine running Ubuntu 24.04 with kernel version 6.8.0-90-generic and 64GB of memory. The guest VM is managed by libvirt and runs Ubuntu 24.04 with kernel version 6.18 (though the head of the mm-unstable branch as a Dec 16, 2025 was also tested and behaves the same) and 48GB of memory. The libvirt XML definition for the VM can be found at [1]. CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE is set in the guest kernel so the hot-pluggable memory is automatically onlined.
Below are the steps to reproduce this behavior:
1) Define and start and virtual machine host$ virsh -c qemu:///system define ./test_vm.xml # test_vm.xml from [1] host$ virsh -c qemu:///system start test_vm
2) Setup swap in the guest guest$ sudo fallocate -l 32G /swapfile guest$ sudo chmod 0600 /swapfile guest$ sudo mkswap /swapfile guest$ sudo swapon /swapfile
3) Use alloc_data [2] to allocate most of the remaining guest memory guest$ ./alloc_data 45
4) In a separate guest terminal, monitor the amount of used memory guest$ watch -n1 free -h
5) When alloc_data has finished allocating, initiate the memory hot-unplug using the provided xml file [3] host$ virsh -c qemu:///system detach-device test_vm ./remove.xml --live
After initiating the memory hot-unplug, you should see the amount of available memory in the guest decrease, and the amount of used swap data increase. If everything works as expected, when all of the memory is unplugged, there should be around 8.5-9GB of data in swap. If the unplugging is unsuccessful, the amount of used swap data will settle below that. If that happens, you should be able to see log messages in dmesg similar to the one posted above.
Link: https://lkml.kernel.org/r/20251216200727.2360228-1-bijan311@gmail.com Link: https://github.com/BijanT/linux_patch_files/blob/main/test_vm.xml [1] Link: https://github.com/BijanT/linux_patch_files/blob/main/alloc_data.c [2] Link: https://github.com/BijanT/linux_patch_files/blob/main/remove.xml [3] Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation") Signed-off-by: Bijan Tabatabai bijan311@gmail.com Acked-by: David Hildenbrand (Red Hat) david@kernel.org Acked-by: Zi Yan ziy@nvidia.com Reviewed-by: Baolin Wang baolin.wang@linux.alibaba.com Cc: Liam Howlett liam.howlett@oracle.com Cc: Lorenzo Stoakes lorenzo.stoakes@oracle.com Cc: Michal Hocko mhocko@suse.com Cc: Mike Rapoport rppt@kernel.org Cc: Shivank Garg shivankg@amd.com Cc: Suren Baghdasaryan surenb@google.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Kairui Song ryncsn@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/mm.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h index 61efff50eb67..c5720c0b7b77 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1823,10 +1823,10 @@ static inline int folio_expected_ref_count(struct folio *folio) if (WARN_ON_ONCE(page_has_type(&folio->page) && !folio_test_hugetlb(folio))) return 0;
- if (folio_test_anon(folio)) { - /* One reference per page from the swapcache. */ - ref_count += folio_test_swapcache(folio) << order; - } else { + /* One reference per page from the swapcache. */ + ref_count += folio_test_swapcache(folio) << order; + + if (!folio_test_anon(folio)) { /* One reference per page from the pagecache. */ ref_count += !!folio->mapping << order; /* One reference from PG_private. */
linux-stable-mirror@lists.linaro.org