The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 78d9161d2bcd442d93d917339297ffa057dbee8c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042951-barbell-aeration-a1ce@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
78d9161d2bcd ("fbdev: fix incorrect address computation in deferred IO")
3ed3811283dd ("fbdev: Refactor implementation of page_mkwrite")
56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
856082f021a2 ("fbdev: defio: fix the pagelist corruption")
8c30e2d81bfd ("fbdev: Don't sort deferred-I/O pages by default")
105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
67b723f5b742 ("drm/fb-helper: Calculate damaged area in separate helper")
aa15c677cc34 ("drm/fb-helper: Fix vertical damage clipping")
a3c286dcef7f ("drm/fb-helper: Fix clip rectangle height")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 78d9161d2bcd442d93d917339297ffa057dbee8c Mon Sep 17 00:00:00 2001
From: Nam Cao <namcao(a)linutronix.de>
Date: Tue, 23 Apr 2024 13:50:53 +0200
Subject: [PATCH] fbdev: fix incorrect address computation in deferred IO
With deferred IO enabled, a page fault happens when data is written to the
framebuffer device. Then driver determines which page is being updated by
calculating the offset of the written virtual address within the virtual
memory area, and uses this offset to get the updated page within the
internal buffer. This page is later copied to hardware (thus the name
"deferred IO").
This offset calculation is only correct if the virtual memory area is
mapped to the beginning of the internal buffer. Otherwise this is wrong.
For example, if users do:
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the
internal buffer. This offset 0xff000 is not accounted for, and wrong page
is updated.
Correct the calculation by using vmf->pgoff instead. With this change, the
variable "offset" will no longer hold the exact offset value, but it is
rounded down to multiples of PAGE_SIZE. But this is still correct, because
this variable is only used to calculate the page offset.
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@or…
Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Reviewed-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423115053.4490-1-namcao@…
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index dae96c9f61cf..806ecd32219b 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -196,7 +196,7 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
*/
static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
{
- unsigned long offset = vmf->address - vmf->vma->vm_start;
+ unsigned long offset = vmf->pgoff << PAGE_SHIFT;
struct page *page = vmf->page;
file_update_time(vmf->vma->vm_file);
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 682886ec69d22363819a83ddddd5d66cb5c791e1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042923-monday-hamlet-26ca@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
682886ec69d2 ("mm: zswap: fix shrinker NULL crash with cgroup_disable=memory")
30fb6a8d9e33 ("mm: zswap: fix writeback shinker GFP_NOIO/GFP_NOFS recursion")
e35606e4167d ("mm/zswap: global lru and shrinker shared by all zswap_pools fix")
bf9b7df23cb3 ("mm/zswap: global lru and shrinker shared by all zswap_pools")
5182661a11ba ("mm: zswap: function ordering: move entry sections out of LRU section")
506a86c5e221 ("mm: zswap: function ordering: public lru api")
abca07c04aa5 ("mm: zswap: function ordering: pool params")
c1a0ecb82bdc ("mm: zswap: function ordering: zswap_pools")
39f3ec8eaa60 ("mm: zswap: function ordering: pool refcounting")
a984649b5c1f ("mm: zswap: function ordering: pool alloc & free")
be7fc97c5283 ("mm: zswap: further cleanup zswap_store()")
fa9ad6e21003 ("mm: zswap: break out zwap_compress()")
ff2972aa1b5d ("mm: zswap: rename __zswap_load() to zswap_decompress()")
7dd1f7f0fc1c ("mm: zswap: move zswap_invalidate_entry() to related functions")
5b297f70bb26 ("mm: zswap: inline and remove zswap_entry_find_get()")
5878303c5353 ("mm/zswap: fix race between lru writeback and swapoff")
db128f5fdee9 ("mm: zswap: remove unused tree argument in zswap_entry_put()")
44c7c734a513 ("mm/zswap: split zswap rb-tree")
bb29fd7760ae ("mm/zswap: make sure each swapfile always have zswap rb-tree")
8409a385a6b4 ("mm/zswap: improve with alloc_workqueue() call")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 682886ec69d22363819a83ddddd5d66cb5c791e1 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes(a)cmpxchg.org>
Date: Thu, 18 Apr 2024 08:26:28 -0400
Subject: [PATCH] mm: zswap: fix shrinker NULL crash with cgroup_disable=memory
Christian reports a NULL deref in zswap that he bisected down to the zswap
shrinker. The issue also cropped up in the bug trackers of libguestfs [1]
and the Red Hat bugzilla [2].
The problem is that when memcg is disabled with the boot time flag, the
zswap shrinker might get called with sc->memcg == NULL. This is okay in
many places, like the lruvec operations. But it crashes in
memcg_page_state() - which is only used due to the non-node accounting of
cgroup's the zswap memory to begin with.
Nhat spotted that the memcg can be NULL in the memcg-disabled case, and I
was then able to reproduce the crash locally as well.
[1] https://github.com/libguestfs/libguestfs/issues/139
[2] https://bugzilla.redhat.com/show_bug.cgi?id=2275252
Link: https://lkml.kernel.org/r/20240418124043.GC1055428@cmpxchg.org
Link: https://lkml.kernel.org/r/20240417143324.GA1055428@cmpxchg.org
Fixes: b5ba474f3f51 ("zswap: shrink zswap pool based on memory pressure")
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reported-by: Christian Heusel <christian(a)heusel.eu>
Debugged-by: Nhat Pham <nphamcs(a)gmail.com>
Suggested-by: Nhat Pham <nphamcs(a)gmail.com>
Tested-by: Christian Heusel <christian(a)heusel.eu>
Acked-by: Yosry Ahmed <yosryahmed(a)google.com>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Dan Streetman <ddstreet(a)ieee.org>
Cc: Richard W.M. Jones <rjones(a)redhat.com>
Cc: Seth Jennings <sjenning(a)redhat.com>
Cc: Vitaly Wool <vitaly.wool(a)konsulko.com>
Cc: <stable(a)vger.kernel.org> [v6.8]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/zswap.c b/mm/zswap.c
index caed028945b0..6f8850c44b61 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1331,15 +1331,22 @@ static unsigned long zswap_shrinker_count(struct shrinker *shrinker,
if (!gfp_has_io_fs(sc->gfp_mask))
return 0;
-#ifdef CONFIG_MEMCG_KMEM
- mem_cgroup_flush_stats(memcg);
- nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
- nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
-#else
- /* use pool stats instead of memcg stats */
- nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
- nr_stored = atomic_read(&zswap_nr_stored);
-#endif
+ /*
+ * For memcg, use the cgroup-wide ZSWAP stats since we don't
+ * have them per-node and thus per-lruvec. Careful if memcg is
+ * runtime-disabled: we can get sc->memcg == NULL, which is ok
+ * for the lruvec, but not for memcg_page_state().
+ *
+ * Without memcg, use the zswap pool-wide metrics.
+ */
+ if (!mem_cgroup_disabled()) {
+ mem_cgroup_flush_stats(memcg);
+ nr_backing = memcg_page_state(memcg, MEMCG_ZSWAP_B) >> PAGE_SHIFT;
+ nr_stored = memcg_page_state(memcg, MEMCG_ZSWAPPED);
+ } else {
+ nr_backing = zswap_pool_total_size >> PAGE_SHIFT;
+ nr_stored = atomic_read(&zswap_nr_stored);
+ }
if (!nr_stored)
return 0;
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x d99e3140a4d33e26066183ff727d8f02f56bec64
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042907-unquote-thank-8de2@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
d99e3140a4d3 ("mm: turn folio_test_hugetlb into a PageType")
29cfe7556bfd ("mm: constify more page/folio tests")
443cbaf9e2fd ("crash: split vmcoreinfo exporting code out from crash_core.c")
85fcde402db1 ("kexec: split crashkernel reservation code out from crash_core.c")
55c49fee57af ("mm/vmalloc: remove vmap_area_list")
d093602919ad ("mm: vmalloc: remove global vmap_area_root rb-tree")
7fa8cee00316 ("mm: vmalloc: move vmap_init_free_space() down in vmalloc.c")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d99e3140a4d33e26066183ff727d8f02f56bec64 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Date: Thu, 21 Mar 2024 14:24:43 +0000
Subject: [PATCH] mm: turn folio_test_hugetlb into a PageType
The current folio_test_hugetlb() can be fooled by a concurrent folio split
into returning true for a folio which has never belonged to hugetlbfs.
This can't happen if the caller holds a refcount on it, but we have a few
places (memory-failure, compaction, procfs) which do not and should not
take a speculative reference.
Since hugetlb pages do not use individual page mapcounts (they are always
fully mapped and use the entire_mapcount field to record the number of
mappings), the PageType field is available now that page_mapcount()
ignores the value in this field.
In compaction and with CONFIG_DEBUG_VM enabled, the current implementation
can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b
("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks
in the PageHuge() testing path.
[willy(a)infradead.org: update vmcoreinfo]
Link: https://lkml.kernel.org/r/ZgGZUvsdhaT1Va-T@casper.infradead.org
Link: https://lkml.kernel.org/r/20240321142448.1645400-6-willy@infradead.org
Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Luis Chamberlain <mcgrof(a)kernel.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 35a0087d0910..4bf1c25fd1dc 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -190,7 +190,6 @@ enum pageflags {
/* At least one page in this folio has the hwpoison flag set */
PG_has_hwpoisoned = PG_error,
- PG_hugetlb = PG_active,
PG_large_rmappable = PG_workingset, /* anon or file-backed */
};
@@ -876,29 +875,6 @@ TESTPAGEFLAG_FALSE(LargeRmappable, large_rmappable)
#define PG_head_mask ((1UL << PG_head))
-#ifdef CONFIG_HUGETLB_PAGE
-int PageHuge(const struct page *page);
-SETPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
-CLEARPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
-
-/**
- * folio_test_hugetlb - Determine if the folio belongs to hugetlbfs
- * @folio: The folio to test.
- *
- * Context: Any context. Caller should have a reference on the folio to
- * prevent it from being turned into a tail page.
- * Return: True for hugetlbfs folios, false for anon folios or folios
- * belonging to other filesystems.
- */
-static inline bool folio_test_hugetlb(const struct folio *folio)
-{
- return folio_test_large(folio) &&
- test_bit(PG_hugetlb, const_folio_flags(folio, 1));
-}
-#else
-TESTPAGEFLAG_FALSE(Huge, hugetlb)
-#endif
-
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/*
* PageHuge() only returns true for hugetlbfs pages, but not for
@@ -954,18 +930,6 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
TESTSCFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
#endif
-/*
- * Check if a page is currently marked HWPoisoned. Note that this check is
- * best effort only and inherently racy: there is no way to synchronize with
- * failing hardware.
- */
-static inline bool is_page_hwpoison(struct page *page)
-{
- if (PageHWPoison(page))
- return true;
- return PageHuge(page) && PageHWPoison(compound_head(page));
-}
-
/*
* For pages that are never mapped to userspace (and aren't PageSlab),
* page_type may be used. Because it is initialised to -1, we invert the
@@ -982,6 +946,7 @@ static inline bool is_page_hwpoison(struct page *page)
#define PG_offline 0x00000100
#define PG_table 0x00000200
#define PG_guard 0x00000400
+#define PG_hugetlb 0x00000800
#define PageType(page, flag) \
((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
@@ -1076,6 +1041,37 @@ PAGE_TYPE_OPS(Table, table, pgtable)
*/
PAGE_TYPE_OPS(Guard, guard, guard)
+#ifdef CONFIG_HUGETLB_PAGE
+FOLIO_TYPE_OPS(hugetlb, hugetlb)
+#else
+FOLIO_TEST_FLAG_FALSE(hugetlb)
+#endif
+
+/**
+ * PageHuge - Determine if the page belongs to hugetlbfs
+ * @page: The page to test.
+ *
+ * Context: Any context.
+ * Return: True for hugetlbfs pages, false for anon pages or pages
+ * belonging to other filesystems.
+ */
+static inline bool PageHuge(const struct page *page)
+{
+ return folio_test_hugetlb(page_folio(page));
+}
+
+/*
+ * Check if a page is currently marked HWPoisoned. Note that this check is
+ * best effort only and inherently racy: there is no way to synchronize with
+ * failing hardware.
+ */
+static inline bool is_page_hwpoison(struct page *page)
+{
+ if (PageHWPoison(page))
+ return true;
+ return PageHuge(page) && PageHWPoison(compound_head(page));
+}
+
extern bool is_free_buddy_page(struct page *page);
PAGEFLAG(Isolated, isolated, PF_ANY);
@@ -1142,7 +1138,7 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
*/
#define PAGE_FLAGS_SECOND \
(0xffUL /* order */ | 1UL << PG_has_hwpoisoned | \
- 1UL << PG_hugetlb | 1UL << PG_large_rmappable)
+ 1UL << PG_large_rmappable)
#define PAGE_FLAGS_PRIVATE \
(1UL << PG_private | 1UL << PG_private_2)
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index d801409b33cf..d55e53ac91bd 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -135,6 +135,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
#define DEF_PAGETYPE_NAME(_name) { PG_##_name, __stringify(_name) }
#define __def_pagetype_names \
+ DEF_PAGETYPE_NAME(hugetlb), \
DEF_PAGETYPE_NAME(offline), \
DEF_PAGETYPE_NAME(guard), \
DEF_PAGETYPE_NAME(table), \
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index f95516cd45bb..23c125c2e243 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -205,11 +205,10 @@ static int __init crash_save_vmcoreinfo_init(void)
VMCOREINFO_NUMBER(PG_head_mask);
#define PAGE_BUDDY_MAPCOUNT_VALUE (~PG_buddy)
VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
-#ifdef CONFIG_HUGETLB_PAGE
- VMCOREINFO_NUMBER(PG_hugetlb);
+#define PAGE_HUGETLB_MAPCOUNT_VALUE (~PG_hugetlb)
+ VMCOREINFO_NUMBER(PAGE_HUGETLB_MAPCOUNT_VALUE);
#define PAGE_OFFLINE_MAPCOUNT_VALUE (~PG_offline)
VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE);
-#endif
#ifdef CONFIG_KALLSYMS
VMCOREINFO_SYMBOL(kallsyms_names);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 53e0ab5c0845..4553241f0fb2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1624,7 +1624,7 @@ static inline void __clear_hugetlb_destructor(struct hstate *h,
{
lockdep_assert_held(&hugetlb_lock);
- folio_clear_hugetlb(folio);
+ __folio_clear_hugetlb(folio);
}
/*
@@ -1711,7 +1711,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
h->surplus_huge_pages_node[nid]++;
}
- folio_set_hugetlb(folio);
+ __folio_set_hugetlb(folio);
folio_change_private(folio, NULL);
/*
* We have to set hugetlb_vmemmap_optimized again as above
@@ -2049,7 +2049,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio)
{
- folio_set_hugetlb(folio);
+ __folio_set_hugetlb(folio);
INIT_LIST_HEAD(&folio->lru);
hugetlb_set_folio_subpool(folio, NULL);
set_hugetlb_cgroup(folio, NULL);
@@ -2159,22 +2159,6 @@ static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
return __prep_compound_gigantic_folio(folio, order, true);
}
-/*
- * PageHuge() only returns true for hugetlbfs pages, but not for normal or
- * transparent huge pages. See the PageTransHuge() documentation for more
- * details.
- */
-int PageHuge(const struct page *page)
-{
- const struct folio *folio;
-
- if (!PageCompound(page))
- return 0;
- folio = page_folio(page);
- return folio_test_hugetlb(folio);
-}
-EXPORT_SYMBOL_GPL(PageHuge);
-
/*
* Find and lock address space (mapping) in write mode.
*
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x a0a8d15a798be4b8f20aca2ba91bf6b688c6a640
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042908-modular-case-bbd4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
a0a8d15a798b ("x86/tdx: Preserve shared bit on mprotect()")
e45964771007 ("x86/coco: Define cc_vendor without CONFIG_ARCH_HAS_CC_PLATFORM")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a0a8d15a798be4b8f20aca2ba91bf6b688c6a640 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Date: Wed, 24 Apr 2024 11:20:35 +0300
Subject: [PATCH] x86/tdx: Preserve shared bit on mprotect()
The TDX guest platform takes one bit from the physical address to
indicate if the page is shared (accessible by VMM). This bit is not part
of the physical_mask and is not preserved during mprotect(). As a
result, the 'shared' bit is lost during mprotect() on shared mappings.
_COMMON_PAGE_CHG_MASK specifies which PTE bits need to be preserved
during modification. AMD includes 'sme_me_mask' in the define to
preserve the 'encrypt' bit.
To cover both Intel and AMD cases, include 'cc_mask' in
_COMMON_PAGE_CHG_MASK instead of 'sme_me_mask'.
Reported-and-tested-by: Chris Oo <cho(a)microsoft.com>
Fixes: 41394e33f3a0 ("x86/tdx: Extend the confidential computing API to support TDX guests")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20240424082035.4092071-1-kirill.shutemov%40linu…
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index c086699b0d0c..aa6c8f8ca958 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -25,6 +25,7 @@ u64 cc_mkdec(u64 val);
void cc_random_init(void);
#else
#define cc_vendor (CC_VENDOR_NONE)
+static const u64 cc_mask = 0;
static inline u64 cc_mkenc(u64 val)
{
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 0b748ee16b3d..9abb8cc4cd47 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -148,7 +148,7 @@
#define _COMMON_PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \
_PAGE_SPECIAL | _PAGE_ACCESSED | \
_PAGE_DIRTY_BITS | _PAGE_SOFT_DIRTY | \
- _PAGE_DEVMAP | _PAGE_ENC | _PAGE_UFFD_WP)
+ _PAGE_DEVMAP | _PAGE_CC | _PAGE_UFFD_WP)
#define _PAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PAT)
#define _HPAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_PAT_LARGE)
@@ -173,6 +173,7 @@ enum page_cache_mode {
};
#endif
+#define _PAGE_CC (_AT(pteval_t, cc_mask))
#define _PAGE_ENC (_AT(pteval_t, sme_me_mask))
#define _PAGE_CACHE_MASK (_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 19843452dca40e28d6d3f4793d998b681d505c7f
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042924-ribcage-browsing-7e8b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
19843452dca4 ("rust: remove `params` from `module` macro example")
b13c9880f909 ("rust: macros: take string literals in `module!`")
c3630df66f95 ("rust: samples: add `rust_print` example")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19843452dca40e28d6d3f4793d998b681d505c7f Mon Sep 17 00:00:00 2001
From: Aswin Unnikrishnan <aswinunni01(a)gmail.com>
Date: Fri, 19 Apr 2024 21:50:13 +0000
Subject: [PATCH] rust: remove `params` from `module` macro example
Remove argument `params` from the `module` macro example, because the
macro does not currently support module parameters since it was not sent
with the initial merge.
Signed-off-by: Aswin Unnikrishnan <aswinunni01(a)gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: 1fbde52bde73 ("rust: add `macros` crate")
Link: https://lore.kernel.org/r/20240419215015.157258-1-aswinunni01@gmail.com
[ Reworded slightly. ]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
diff --git a/rust/macros/lib.rs b/rust/macros/lib.rs
index f489f3157383..520eae5fd792 100644
--- a/rust/macros/lib.rs
+++ b/rust/macros/lib.rs
@@ -35,18 +35,6 @@ use proc_macro::TokenStream;
/// author: "Rust for Linux Contributors",
/// description: "My very own kernel module!",
/// license: "GPL",
-/// params: {
-/// my_i32: i32 {
-/// default: 42,
-/// permissions: 0o000,
-/// description: "Example of i32",
-/// },
-/// writeable_i32: i32 {
-/// default: 42,
-/// permissions: 0o644,
-/// description: "Example of i32",
-/// },
-/// },
/// }
///
/// struct MyModule;
The ISRs of the tps25750 and tps6598x do not handle generated events
properly under all circumstances.
The tps6598x ISR does not read all bits of the INT_EVENTX registers in
newer devices (tps65987 and tps65988, which have 11-byte interrupt
registers), leaving events signaled with bits above 64 unattended.
Moreover, these events are not cleared, leaving the interrupt enabled.
The tps25750 reads all bits of the INT_EVENT1 register, but the event
checking is not right because the same event is checked in two different
regions of the same register by means of an OR operation.
This series aims to fix both issues by reading all bits of the
INT_EVENTX registers, and limiting the event checking to the region
where the supported events are defined (currently they are limited to
the first 64 bits of the registers, as the are defined as BIT_ULL()).
In order to identify the interrupt register size, a simple mechanism has
been added to read the version register and select the right size
accordingly. This mechanism has been tested with a TPS65987DDHRSHR (DH
variant) which returned the expected value (0xF7).
If the need for events above the first 64 bits of the INT_EVENTX
registers arises, a different mechanism might be required. But for the
current needs, all definitions can be left as they are.
When at it, an explicit device_get_match_data() (which is already
contained in i2c_get_match_data()) has been removed.
Signed-off-by: Javier Carrasco <javier.carrasco(a)wolfvision.net>
---
Changes in v3:
- Remove explicit usage of device_get_match_data() (new patch).
- Fix series versioning (last resend should have been v2).
- Keep 64 bit reads in the interrupt handler for tps65981/2/6 by reading
the version register.
- Link to v2: https://lore.kernel.org/r/20240328-tps6598x_fix_event_handling-v1-0-502721f…
Changes in v2:
- Add Cc tag for 'stable' (fixes).
- Link to v1: https://lore.kernel.org/r/20240325-tps6598x_fix_event_handling-v1-0-633fe0c…
---
Javier Carrasco (3):
usb: typec: tipd: fix event checking for tps25750
usb: typec: tipd: fix event checking for tps6598x
usb: typec: tipd: rely on i2c_get_match_data()
drivers/usb/typec/tipd/core.c | 56 +++++++++++++++++++++++++--------------
drivers/usb/typec/tipd/tps6598x.h | 11 ++++++++
2 files changed, 47 insertions(+), 20 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240328-tps6598x_fix_event_handling-3398d3d82f85
Best regards,
--
Javier Carrasco <javier.carrasco(a)wolfvision.net>
For the CVE-2024-26865 issue, the stable 5.10 branch code also involves
this issue. However, the patch of the mainline version cannot be used on
stable 5.10 branch. The commit 740ea3c4a0b2("tcp: Clean up kernel
listener' s reqsk in inet_twsk_purge()") is required to stop the timer.
Eric Dumazet (1):
tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge()
Kuniyuki Iwashima (1):
tcp: Clean up kernel listener's reqsk in inet_twsk_purge()
net/ipv4/inet_timewait_sock.c | 32 +++++++++++++++++++++-----------
1 file changed, 21 insertions(+), 11 deletions(-)
--
2.34.1
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x d99e3140a4d33e26066183ff727d8f02f56bec64
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024042908-squint-barometer-51de@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
d99e3140a4d3 ("mm: turn folio_test_hugetlb into a PageType")
29cfe7556bfd ("mm: constify more page/folio tests")
443cbaf9e2fd ("crash: split vmcoreinfo exporting code out from crash_core.c")
85fcde402db1 ("kexec: split crashkernel reservation code out from crash_core.c")
55c49fee57af ("mm/vmalloc: remove vmap_area_list")
d093602919ad ("mm: vmalloc: remove global vmap_area_root rb-tree")
7fa8cee00316 ("mm: vmalloc: move vmap_init_free_space() down in vmalloc.c")
4a693ce65b18 ("kdump: defer the insertion of crashkernel resources")
9f2a63523582 ("Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d99e3140a4d33e26066183ff727d8f02f56bec64 Mon Sep 17 00:00:00 2001
From: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Date: Thu, 21 Mar 2024 14:24:43 +0000
Subject: [PATCH] mm: turn folio_test_hugetlb into a PageType
The current folio_test_hugetlb() can be fooled by a concurrent folio split
into returning true for a folio which has never belonged to hugetlbfs.
This can't happen if the caller holds a refcount on it, but we have a few
places (memory-failure, compaction, procfs) which do not and should not
take a speculative reference.
Since hugetlb pages do not use individual page mapcounts (they are always
fully mapped and use the entire_mapcount field to record the number of
mappings), the PageType field is available now that page_mapcount()
ignores the value in this field.
In compaction and with CONFIG_DEBUG_VM enabled, the current implementation
can result in an oops, as reported by Luis. This happens since 9c5ccf2db04b
("mm: remove HUGETLB_PAGE_DTOR") effectively added some VM_BUG_ON() checks
in the PageHuge() testing path.
[willy(a)infradead.org: update vmcoreinfo]
Link: https://lkml.kernel.org/r/ZgGZUvsdhaT1Va-T@casper.infradead.org
Link: https://lkml.kernel.org/r/20240321142448.1645400-6-willy@infradead.org
Fixes: 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR")
Signed-off-by: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Luis Chamberlain <mcgrof(a)kernel.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218227
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 35a0087d0910..4bf1c25fd1dc 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -190,7 +190,6 @@ enum pageflags {
/* At least one page in this folio has the hwpoison flag set */
PG_has_hwpoisoned = PG_error,
- PG_hugetlb = PG_active,
PG_large_rmappable = PG_workingset, /* anon or file-backed */
};
@@ -876,29 +875,6 @@ TESTPAGEFLAG_FALSE(LargeRmappable, large_rmappable)
#define PG_head_mask ((1UL << PG_head))
-#ifdef CONFIG_HUGETLB_PAGE
-int PageHuge(const struct page *page);
-SETPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
-CLEARPAGEFLAG(HugeTLB, hugetlb, PF_SECOND)
-
-/**
- * folio_test_hugetlb - Determine if the folio belongs to hugetlbfs
- * @folio: The folio to test.
- *
- * Context: Any context. Caller should have a reference on the folio to
- * prevent it from being turned into a tail page.
- * Return: True for hugetlbfs folios, false for anon folios or folios
- * belonging to other filesystems.
- */
-static inline bool folio_test_hugetlb(const struct folio *folio)
-{
- return folio_test_large(folio) &&
- test_bit(PG_hugetlb, const_folio_flags(folio, 1));
-}
-#else
-TESTPAGEFLAG_FALSE(Huge, hugetlb)
-#endif
-
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/*
* PageHuge() only returns true for hugetlbfs pages, but not for
@@ -954,18 +930,6 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
TESTSCFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
#endif
-/*
- * Check if a page is currently marked HWPoisoned. Note that this check is
- * best effort only and inherently racy: there is no way to synchronize with
- * failing hardware.
- */
-static inline bool is_page_hwpoison(struct page *page)
-{
- if (PageHWPoison(page))
- return true;
- return PageHuge(page) && PageHWPoison(compound_head(page));
-}
-
/*
* For pages that are never mapped to userspace (and aren't PageSlab),
* page_type may be used. Because it is initialised to -1, we invert the
@@ -982,6 +946,7 @@ static inline bool is_page_hwpoison(struct page *page)
#define PG_offline 0x00000100
#define PG_table 0x00000200
#define PG_guard 0x00000400
+#define PG_hugetlb 0x00000800
#define PageType(page, flag) \
((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
@@ -1076,6 +1041,37 @@ PAGE_TYPE_OPS(Table, table, pgtable)
*/
PAGE_TYPE_OPS(Guard, guard, guard)
+#ifdef CONFIG_HUGETLB_PAGE
+FOLIO_TYPE_OPS(hugetlb, hugetlb)
+#else
+FOLIO_TEST_FLAG_FALSE(hugetlb)
+#endif
+
+/**
+ * PageHuge - Determine if the page belongs to hugetlbfs
+ * @page: The page to test.
+ *
+ * Context: Any context.
+ * Return: True for hugetlbfs pages, false for anon pages or pages
+ * belonging to other filesystems.
+ */
+static inline bool PageHuge(const struct page *page)
+{
+ return folio_test_hugetlb(page_folio(page));
+}
+
+/*
+ * Check if a page is currently marked HWPoisoned. Note that this check is
+ * best effort only and inherently racy: there is no way to synchronize with
+ * failing hardware.
+ */
+static inline bool is_page_hwpoison(struct page *page)
+{
+ if (PageHWPoison(page))
+ return true;
+ return PageHuge(page) && PageHWPoison(compound_head(page));
+}
+
extern bool is_free_buddy_page(struct page *page);
PAGEFLAG(Isolated, isolated, PF_ANY);
@@ -1142,7 +1138,7 @@ static __always_inline void __ClearPageAnonExclusive(struct page *page)
*/
#define PAGE_FLAGS_SECOND \
(0xffUL /* order */ | 1UL << PG_has_hwpoisoned | \
- 1UL << PG_hugetlb | 1UL << PG_large_rmappable)
+ 1UL << PG_large_rmappable)
#define PAGE_FLAGS_PRIVATE \
(1UL << PG_private | 1UL << PG_private_2)
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index d801409b33cf..d55e53ac91bd 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -135,6 +135,7 @@ IF_HAVE_PG_ARCH_X(arch_3)
#define DEF_PAGETYPE_NAME(_name) { PG_##_name, __stringify(_name) }
#define __def_pagetype_names \
+ DEF_PAGETYPE_NAME(hugetlb), \
DEF_PAGETYPE_NAME(offline), \
DEF_PAGETYPE_NAME(guard), \
DEF_PAGETYPE_NAME(table), \
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index f95516cd45bb..23c125c2e243 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -205,11 +205,10 @@ static int __init crash_save_vmcoreinfo_init(void)
VMCOREINFO_NUMBER(PG_head_mask);
#define PAGE_BUDDY_MAPCOUNT_VALUE (~PG_buddy)
VMCOREINFO_NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE);
-#ifdef CONFIG_HUGETLB_PAGE
- VMCOREINFO_NUMBER(PG_hugetlb);
+#define PAGE_HUGETLB_MAPCOUNT_VALUE (~PG_hugetlb)
+ VMCOREINFO_NUMBER(PAGE_HUGETLB_MAPCOUNT_VALUE);
#define PAGE_OFFLINE_MAPCOUNT_VALUE (~PG_offline)
VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE);
-#endif
#ifdef CONFIG_KALLSYMS
VMCOREINFO_SYMBOL(kallsyms_names);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 53e0ab5c0845..4553241f0fb2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1624,7 +1624,7 @@ static inline void __clear_hugetlb_destructor(struct hstate *h,
{
lockdep_assert_held(&hugetlb_lock);
- folio_clear_hugetlb(folio);
+ __folio_clear_hugetlb(folio);
}
/*
@@ -1711,7 +1711,7 @@ static void add_hugetlb_folio(struct hstate *h, struct folio *folio,
h->surplus_huge_pages_node[nid]++;
}
- folio_set_hugetlb(folio);
+ __folio_set_hugetlb(folio);
folio_change_private(folio, NULL);
/*
* We have to set hugetlb_vmemmap_optimized again as above
@@ -2049,7 +2049,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
static void init_new_hugetlb_folio(struct hstate *h, struct folio *folio)
{
- folio_set_hugetlb(folio);
+ __folio_set_hugetlb(folio);
INIT_LIST_HEAD(&folio->lru);
hugetlb_set_folio_subpool(folio, NULL);
set_hugetlb_cgroup(folio, NULL);
@@ -2159,22 +2159,6 @@ static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
return __prep_compound_gigantic_folio(folio, order, true);
}
-/*
- * PageHuge() only returns true for hugetlbfs pages, but not for normal or
- * transparent huge pages. See the PageTransHuge() documentation for more
- * details.
- */
-int PageHuge(const struct page *page)
-{
- const struct folio *folio;
-
- if (!PageCompound(page))
- return 0;
- folio = page_folio(page);
- return folio_test_hugetlb(folio);
-}
-EXPORT_SYMBOL_GPL(PageHuge);
-
/*
* Find and lock address space (mapping) in write mode.
*
We missed setting the CCS mode during resume and engine resets.
Create a workaround to be added in the engine's workaround list.
This workaround sets the XEHP_CCS_MODE value at every reset.
The issue can be reproduced by running:
$ clpeak --kernel-latency
Without resetting the CCS mode, we encounter a fence timeout:
Fence expiration time out i915-0000:03:00.0:clpeak[2387]:2!
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
Fixes: 6db31251bb26 ("drm/i915/gt: Enable only one CCS for compute workload")
Reported-by: Gnattu OC <gnattuoc(a)me.com>
Signed-off-by: Andi Shyti <andi.shyti(a)linux.intel.com>
Cc: Chris Wilson <chris.p.wilson(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v6.2+
---
Hi Gnattu,
thanks again for reporting this issue and for your prompt
replies on the issue. Would you give this patch a chance?
Andi
drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 6 +++---
drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 2 +-
drivers/gpu/drm/i915/gt/intel_workarounds.c | 4 +++-
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
index 044219c5960a..99b71bb7da0a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
@@ -8,14 +8,14 @@
#include "intel_gt_ccs_mode.h"
#include "intel_gt_regs.h"
-void intel_gt_apply_ccs_mode(struct intel_gt *gt)
+unsigned int intel_gt_apply_ccs_mode(struct intel_gt *gt)
{
int cslice;
u32 mode = 0;
int first_ccs = __ffs(CCS_MASK(gt));
if (!IS_DG2(gt->i915))
- return;
+ return 0;
/* Build the value for the fixed CCS load balancing */
for (cslice = 0; cslice < I915_MAX_CCS; cslice++) {
@@ -35,5 +35,5 @@ void intel_gt_apply_ccs_mode(struct intel_gt *gt)
XEHP_CCS_MODE_CSLICE_MASK);
}
- intel_uncore_write(gt->uncore, XEHP_CCS_MODE, mode);
+ return mode;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
index 9e5549caeb26..55547f2ff426 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h
@@ -8,6 +8,6 @@
struct intel_gt;
-void intel_gt_apply_ccs_mode(struct intel_gt *gt);
+unsigned int intel_gt_apply_ccs_mode(struct intel_gt *gt);
#endif /* __INTEL_GT_CCS_MODE_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 68b6aa11bcf7..58693923bf6c 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2703,6 +2703,7 @@ add_render_compute_tuning_settings(struct intel_gt *gt,
static void ccs_engine_wa_mode(struct intel_engine_cs *engine, struct i915_wa_list *wal)
{
struct intel_gt *gt = engine->gt;
+ u32 mode;
if (!IS_DG2(gt->i915))
return;
@@ -2719,7 +2720,8 @@ static void ccs_engine_wa_mode(struct intel_engine_cs *engine, struct i915_wa_li
* After having disabled automatic load balancing we need to
* assign all slices to a single CCS. We will call it CCS mode 1
*/
- intel_gt_apply_ccs_mode(gt);
+ mode = intel_gt_apply_ccs_mode(gt);
+ wa_masked_en(wal, XEHP_CCS_MODE, mode);
}
/*
--
2.43.0