- Linux-stable-mirror - lists.linaro.org

FAILED: patch "[PATCH] KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or" failed to apply to 6.17-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.17-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y git checkout FETCH_HEAD git cherry-pick -x 9d7dfb95da2cb5c1287df2f3468bcb70d8b31087 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112027-ranch-retool-efaa@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 9d7dfb95da2cb5c1287df2f3468bcb70d8b31087 Mon Sep 17 00:00:00 2001 From: Sean Christopherson <seanjc(a)google.com> Date: Thu, 16 Oct 2025 11:21:47 -0700 Subject: [PATCH] KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add VMX exit handlers for SEAMCALL and TDCALL to inject a #UD if a non-TD guest attempts to execute SEAMCALL or TDCALL. Neither SEAMCALL nor TDCALL is gated by any software enablement other than VMXON, and so will generate a VM-Exit instead of e.g. a native #UD when executed from the guest kernel. Note! No unprivileged DoS of the L1 kernel is possible as TDCALL and SEAMCALL #GP at CPL > 0, and the CPL check is performed prior to the VMX non-root (VM-Exit) check, i.e. userspace can't crash the VM. And for a nested guest, KVM forwards unknown exits to L1, i.e. an L2 kernel can crash itself, but not L1. Note #2! The Intel® Trust Domain CPU Architectural Extensions spec's pseudocode shows the CPL > 0 check for SEAMCALL coming _after_ the VM-Exit, but that appears to be a documentation bug (likely because the CPL > 0 check was incorrectly bundled with other lower-priority #GP checks). Testing on SPR and EMR shows that the CPL > 0 check is performed before the VMX non-root check, i.e. SEAMCALL #GPs when executed in usermode. Note #3! The aforementioned Trust Domain spec uses confusing pseudocode that says that SEAMCALL will #UD if executed "inSEAM", but "inSEAM" specifically means in SEAM Root Mode, i.e. in the TDX-Module. The long- form description explicitly states that SEAMCALL generates an exit when executed in "SEAM VMX non-root operation". But that's a moot point as the TDX-Module injects #UD if the guest attempts to execute SEAMCALL, as documented in the "Unconditionally Blocked Instructions" section of the TDX-Module base specification. Cc: stable(a)vger.kernel.org Cc: Kai Huang <kai.huang(a)intel.com> Cc: Xiaoyao Li <xiaoyao.li(a)intel.com> Cc: Rick Edgecombe <rick.p.edgecombe(a)intel.com> Cc: Dan Williams <dan.j.williams(a)intel.com> Cc: Binbin Wu <binbin.wu(a)linux.intel.com> Reviewed-by: Kai Huang <kai.huang(a)intel.com> Reviewed-by: Binbin Wu <binbin.wu(a)linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li(a)intel.com> Link: https://lore.kernel.org/r/20251016182148.69085-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc(a)google.com> diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index 9792e329343e..1baa86dfe029 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -93,6 +93,7 @@ #define EXIT_REASON_TPAUSE 68 #define EXIT_REASON_BUS_LOCK 74 #define EXIT_REASON_NOTIFY 75 +#define EXIT_REASON_SEAMCALL 76 #define EXIT_REASON_TDCALL 77 #define EXIT_REASON_MSR_READ_IMM 84 #define EXIT_REASON_MSR_WRITE_IMM 85 diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 76271962cb70..bcea087b642f 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -6728,6 +6728,14 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu, case EXIT_REASON_NOTIFY: /* Notify VM exit is not exposed to L1 */ return false; + case EXIT_REASON_SEAMCALL: + case EXIT_REASON_TDCALL: + /* + * SEAMCALL and TDCALL unconditionally VM-Exit, but aren't + * virtualized by KVM for L1 hypervisors, i.e. L1 should + * never want or expect such an exit. + */ + return false; default: return true; } diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index f87c216d976d..91b6f2f3edc2 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6032,6 +6032,12 @@ static int handle_vmx_instruction(struct kvm_vcpu *vcpu) return 1; } +static int handle_tdx_instruction(struct kvm_vcpu *vcpu) +{ + kvm_queue_exception(vcpu, UD_VECTOR); + return 1; +} + #ifndef CONFIG_X86_SGX_KVM static int handle_encls(struct kvm_vcpu *vcpu) { @@ -6157,6 +6163,8 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = { [EXIT_REASON_ENCLS] = handle_encls, [EXIT_REASON_BUS_LOCK] = handle_bus_lock_vmexit, [EXIT_REASON_NOTIFY] = handle_notify, + [EXIT_REASON_SEAMCALL] = handle_tdx_instruction, + [EXIT_REASON_TDCALL] = handle_tdx_instruction, [EXIT_REASON_MSR_READ_IMM] = handle_rdmsr_imm, [EXIT_REASON_MSR_WRITE_IMM] = handle_wrmsr_imm, };

2 weeks, 5 days

2
3
0 0

[PATCH 23/26] drm/amd/display: Correct DSC padding accounting

by Alex Hung

From: Relja Vojvodic <rvojvodi(a)amd.com> [WHY] - After the addition of all OVT patches, DSC padding was being accounted for multiple times, effectively doubling the padding - This caused compliance failures or corruption [HOW] - Add padding to DSC pic width when required by HW, and do not re-add when calculating reg values - Do not add padding when computing PPS values, and instead track padding separately to add when calculating slice width values Cc: Mario Limonciello <mario.limonciello(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Reviewed-by: Chris Park <chris.park(a)amd.com> Reviewed-by: Wenjing Liu <wenjing.liu(a)amd.com> Signed-off-by: Relja Vojvodic <rvojvodi(a)amd.com> Signed-off-by: Alex Hung <alex.hung(a)amd.com> --- drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c | 2 +- drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c | 2 +- drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c | 2 +- drivers/gpu/drm/amd/display/dc/link/link_dpms.c | 3 ++- .../gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c | 6 +++--- 5 files changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c index 4ee6ed610de0..3e239124c17d 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn314/dcn314_hwseq.c @@ -108,7 +108,7 @@ static void update_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) dsc_cfg.dc_dsc_cfg = stream->timing.dsc_cfg; ASSERT(dsc_cfg.dc_dsc_cfg.num_slices_h % opp_cnt == 0); dsc_cfg.dc_dsc_cfg.num_slices_h /= opp_cnt; - dsc_cfg.dsc_padding = pipe_ctx->dsc_padding_params.dsc_hactive_padding; + dsc_cfg.dsc_padding = 0; dsc->funcs->dsc_set_config(dsc, &dsc_cfg, &dsc_optc_cfg); dsc->funcs->dsc_enable(dsc, pipe_ctx->stream_res.opp->inst); diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c index bf19ba65d09a..b213a2ac827a 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn32/dcn32_hwseq.c @@ -1061,7 +1061,7 @@ void dcn32_update_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) dsc_cfg.dc_dsc_cfg = stream->timing.dsc_cfg; ASSERT(dsc_cfg.dc_dsc_cfg.num_slices_h % opp_cnt == 0); dsc_cfg.dc_dsc_cfg.num_slices_h /= opp_cnt; - dsc_cfg.dsc_padding = pipe_ctx->dsc_padding_params.dsc_hactive_padding; + dsc_cfg.dsc_padding = 0; if (should_use_dto_dscclk) dccg->funcs->set_dto_dscclk(dccg, dsc->inst, dsc_cfg.dc_dsc_cfg.num_slices_h); diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c index 7aa0f452e8f7..cb2dfd34b5e2 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c @@ -364,7 +364,7 @@ static void update_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) dsc_cfg.dc_dsc_cfg = stream->timing.dsc_cfg; ASSERT(dsc_cfg.dc_dsc_cfg.num_slices_h % opp_cnt == 0); dsc_cfg.dc_dsc_cfg.num_slices_h /= opp_cnt; - dsc_cfg.dsc_padding = pipe_ctx->dsc_padding_params.dsc_hactive_padding; + dsc_cfg.dsc_padding = 0; dsc->funcs->dsc_set_config(dsc, &dsc_cfg, &dsc_optc_cfg); dsc->funcs->dsc_enable(dsc, pipe_ctx->stream_res.opp->inst); diff --git a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c index 1b1ce3839922..77e049917c4d 100644 --- a/drivers/gpu/drm/amd/display/dc/link/link_dpms.c +++ b/drivers/gpu/drm/amd/display/dc/link/link_dpms.c @@ -841,7 +841,7 @@ void link_set_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) dsc_cfg.dc_dsc_cfg = stream->timing.dsc_cfg; ASSERT(dsc_cfg.dc_dsc_cfg.num_slices_h % opp_cnt == 0); dsc_cfg.dc_dsc_cfg.num_slices_h /= opp_cnt; - dsc_cfg.dsc_padding = pipe_ctx->dsc_padding_params.dsc_hactive_padding; + dsc_cfg.dsc_padding = 0; if (should_use_dto_dscclk) dccg->funcs->set_dto_dscclk(dccg, dsc->inst, dsc_cfg.dc_dsc_cfg.num_slices_h); @@ -857,6 +857,7 @@ void link_set_dsc_on_stream(struct pipe_ctx *pipe_ctx, bool enable) } dsc_cfg.dc_dsc_cfg.num_slices_h *= opp_cnt; dsc_cfg.pic_width *= opp_cnt; + dsc_cfg.dsc_padding = pipe_ctx->dsc_padding_params.dsc_hactive_padding; optc_dsc_mode = dsc_optc_cfg.is_pixel_format_444 ? OPTC_DSC_ENABLED_444 : OPTC_DSC_ENABLED_NATIVE_SUBSAMPLED; diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c index 6679c1a14f2f..8d10aac9c510 100644 --- a/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn20/dcn20_resource.c @@ -1660,8 +1660,8 @@ bool dcn20_validate_dsc(struct dc *dc, struct dc_state *new_ctx) if (pipe_ctx->top_pipe || pipe_ctx->prev_odm_pipe || !stream || !stream->timing.flags.DSC) continue; - dsc_cfg.pic_width = (stream->timing.h_addressable + stream->timing.h_border_left - + stream->timing.h_border_right) / opp_cnt; + dsc_cfg.pic_width = (stream->timing.h_addressable + pipe_ctx->dsc_padding_params.dsc_hactive_padding + + stream->timing.h_border_left + stream->timing.h_border_right) / opp_cnt; dsc_cfg.pic_height = stream->timing.v_addressable + stream->timing.v_border_top + stream->timing.v_border_bottom; dsc_cfg.pixel_encoding = stream->timing.pixel_encoding; @@ -1669,7 +1669,7 @@ bool dcn20_validate_dsc(struct dc *dc, struct dc_state *new_ctx) dsc_cfg.is_odm = pipe_ctx->next_odm_pipe ? true : false; dsc_cfg.dc_dsc_cfg = stream->timing.dsc_cfg; dsc_cfg.dc_dsc_cfg.num_slices_h /= opp_cnt; - dsc_cfg.dsc_padding = pipe_ctx->dsc_padding_params.dsc_hactive_padding; + dsc_cfg.dsc_padding = 0; if (!pipe_ctx->stream_res.dsc->funcs->dsc_validate_stream(pipe_ctx->stream_res.dsc, &dsc_cfg)) return false; -- 2.43.0

2 weeks, 5 days

2
2
0 0

FAILED: patch "[PATCH] mm/truncate: unmap large folio on split failure" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x fa04f5b60fda62c98a53a60de3a1e763f11feb41 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112038-designate-amino-d210@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From fa04f5b60fda62c98a53a60de3a1e763f11feb41 Mon Sep 17 00:00:00 2001 From: Kiryl Shutsemau <kas(a)kernel.org> Date: Mon, 27 Oct 2025 11:56:36 +0000 Subject: [PATCH] mm/truncate: unmap large folio on split failure Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are supposed to generate SIGBUS. This behavior might not be respected on truncation. During truncation, the kernel splits a large folio in order to reclaim memory. As a side effect, it unmaps the folio and destroys PMD mappings of the folio. The folio will be refaulted as PTEs and SIGBUS semantics are preserved. However, if the split fails, PMD mappings are preserved and the user will not receive SIGBUS on any accesses within the PMD. Unmap the folio on split failure. It will lead to refault as PTEs and preserve SIGBUS semantics. Make an exception for shmem/tmpfs that for long time intentionally mapped with PMDs across i_size. Link: https://lkml.kernel.org/r/20251027115636.82382-3-kirill@shutemov.name Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios") Signed-off-by: Kiryl Shutsemau <kas(a)kernel.org> Cc: Al Viro <viro(a)zeniv.linux.org.uk> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Christian Brauner <brauner(a)kernel.org> Cc: "Darrick J. Wong" <djwong(a)kernel.org> Cc: Dave Chinner <david(a)fromorbit.com> Cc: David Hildenbrand <david(a)redhat.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Liam Howlett <liam.howlett(a)oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Michal Hocko <mhocko(a)suse.com> Cc: Mike Rapoport <rppt(a)kernel.org> Cc: Rik van Riel <riel(a)surriel.com> Cc: Shakeel Butt <shakeel.butt(a)linux.dev> Cc: Suren Baghdasaryan <surenb(a)google.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/mm/truncate.c b/mm/truncate.c index 9210cf808f5c..3c5a50ae3274 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -177,6 +177,32 @@ int truncate_inode_folio(struct address_space *mapping, struct folio *folio) return 0; } +static int try_folio_split_or_unmap(struct folio *folio, struct page *split_at, + unsigned long min_order) +{ + enum ttu_flags ttu_flags = + TTU_SYNC | + TTU_SPLIT_HUGE_PMD | + TTU_IGNORE_MLOCK; + int ret; + + ret = try_folio_split_to_order(folio, split_at, min_order); + + /* + * If the split fails, unmap the folio, so it will be refaulted + * with PTEs to respect SIGBUS semantics. + * + * Make an exception for shmem/tmpfs that for long time + * intentionally mapped with PMDs across i_size. + */ + if (ret && !shmem_mapping(folio->mapping)) { + try_to_unmap(folio, ttu_flags); + WARN_ON(folio_mapped(folio)); + } + + return ret; +} + /* * Handle partial folios. The folio may be entirely within the * range if a split has raced with us. If not, we zero the part of the @@ -226,7 +252,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end) min_order = mapping_min_folio_order(folio->mapping); split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE); - if (!try_folio_split_to_order(folio, split_at, min_order)) { + if (!try_folio_split_or_unmap(folio, split_at, min_order)) { /* * try to split at offset + length to make sure folios within * the range can be dropped, especially to avoid memory waste @@ -250,13 +276,10 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end) if (!folio_trylock(folio2)) goto out; - /* - * make sure folio2 is large and does not change its mapping. - * Its split result does not matter here. - */ + /* make sure folio2 is large and does not change its mapping */ if (folio_test_large(folio2) && folio2->mapping == folio->mapping) - try_folio_split_to_order(folio2, split_at2, min_order); + try_folio_split_or_unmap(folio2, split_at2, min_order); folio_unlock(folio2); out:

2 weeks, 5 days

2
1
0 0

FAILED: patch "[PATCH] mm/memory: do not populate page table entries beyond i_size" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x 74207de2ba10c2973334906822dc94d2e859ffc5 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112025-voucher-hexagram-61bf@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 74207de2ba10c2973334906822dc94d2e859ffc5 Mon Sep 17 00:00:00 2001 From: Kiryl Shutsemau <kas(a)kernel.org> Date: Mon, 27 Oct 2025 11:56:35 +0000 Subject: [PATCH] mm/memory: do not populate page table entries beyond i_size Patch series "Fix SIGBUS semantics with large folios", v3. Accessing memory within a VMA, but beyond i_size rounded up to the next page size, is supposed to generate SIGBUS. Darrick reported[1] an xfstests regression in v6.18-rc1. generic/749 failed due to missing SIGBUS. This was caused by my recent changes that try to fault in the whole folio where possible: 19773df031bc ("mm/fault: try to map the entire file folio in finish_fault()") 357b92761d94 ("mm/filemap: map entire large folio faultaround") These changes did not consider i_size when setting up PTEs, leading to xfstest breakage. However, the problem has been present in the kernel for a long time - since huge tmpfs was introduced in 2016. The kernel happily maps PMD-sized folios as PMD without checking i_size. And huge=always tmpfs allocates PMD-size folios on any writes. I considered this corner case when I implemented a large tmpfs, and my conclusion was that no one in their right mind should rely on receiving a SIGBUS signal when accessing beyond i_size. I cannot imagine how it could be useful for the workload. But apparently filesystem folks care a lot about preserving strict SIGBUS semantics. Generic/749 was introduced last year with reference to POSIX, but no real workloads were mentioned. It also acknowledged the tmpfs deviation from the test case. POSIX indeed says[3]: References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal. The patchset fixes the regression introduced by recent changes as well as more subtle SIGBUS breakage due to split failure on truncation. This patch (of 2): Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are supposed to generate SIGBUS. Recent changes attempted to fault in full folio where possible. They did not respect i_size, which led to populating PTEs beyond i_size and breaking SIGBUS semantics. Darrick reported generic/749 breakage because of this. However, the problem existed before the recent changes. With huge=always tmpfs, any write to a file leads to PMD-size allocation. Following the fault-in of the folio will install PMD mapping regardless of i_size. Fix filemap_map_pages() and finish_fault() to not install: - PTEs beyond i_size; - PMD mappings across i_size; Make an exception for shmem/tmpfs that for long time intentionally mapped with PMDs across i_size. Link: https://lkml.kernel.org/r/20251027115636.82382-1-kirill@shutemov.name Link: https://lkml.kernel.org/r/20251027115636.82382-2-kirill@shutemov.name Signed-off-by: Kiryl Shutsemau <kas(a)kernel.org> Fixes: 6795801366da ("xfs: Support large folios") Reported-by: "Darrick J. Wong" <djwong(a)kernel.org> Cc: Al Viro <viro(a)zeniv.linux.org.uk> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Christian Brauner <brauner(a)kernel.org> Cc: Dave Chinner <david(a)fromorbit.com> Cc: David Hildenbrand <david(a)redhat.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Johannes Weiner <hannes(a)cmpxchg.org> Cc: Liam Howlett <liam.howlett(a)oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Michal Hocko <mhocko(a)suse.com> Cc: Mike Rapoport <rppt(a)kernel.org> Cc: Rik van Riel <riel(a)surriel.com> Cc: Shakeel Butt <shakeel.butt(a)linux.dev> Cc: Suren Baghdasaryan <surenb(a)google.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/mm/filemap.c b/mm/filemap.c index 13f0259d993c..2f1e7e283a51 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3681,7 +3681,8 @@ static struct folio *next_uptodate_folio(struct xa_state *xas, static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf, struct folio *folio, unsigned long start, unsigned long addr, unsigned int nr_pages, - unsigned long *rss, unsigned short *mmap_miss) + unsigned long *rss, unsigned short *mmap_miss, + bool can_map_large) { unsigned int ref_from_caller = 1; vm_fault_t ret = 0; @@ -3696,7 +3697,7 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf, * The folio must not cross VMA or page table boundary. */ addr0 = addr - start * PAGE_SIZE; - if (folio_within_vma(folio, vmf->vma) && + if (can_map_large && folio_within_vma(folio, vmf->vma) && (addr0 & PMD_MASK) == ((addr0 + folio_size(folio) - 1) & PMD_MASK)) { vmf->pte -= start; page -= start; @@ -3811,13 +3812,27 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, unsigned long rss = 0; unsigned int nr_pages = 0, folio_type; unsigned short mmap_miss = 0, mmap_miss_saved; + bool can_map_large; rcu_read_lock(); folio = next_uptodate_folio(&xas, mapping, end_pgoff); if (!folio) goto out; - if (filemap_map_pmd(vmf, folio, start_pgoff)) { + file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1; + end_pgoff = min(end_pgoff, file_end); + + /* + * Do not allow to map with PTEs beyond i_size and with PMD + * across i_size to preserve SIGBUS semantics. + * + * Make an exception for shmem/tmpfs that for long time + * intentionally mapped with PMDs across i_size. + */ + can_map_large = shmem_mapping(mapping) || + file_end >= folio_next_index(folio); + + if (can_map_large && filemap_map_pmd(vmf, folio, start_pgoff)) { ret = VM_FAULT_NOPAGE; goto out; } @@ -3830,10 +3845,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, goto out; } - file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1; - if (end_pgoff > file_end) - end_pgoff = file_end; - folio_type = mm_counter_file(folio); do { unsigned long end; @@ -3850,7 +3861,8 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, else ret |= filemap_map_folio_range(vmf, folio, xas.xa_index - folio->index, addr, - nr_pages, &rss, &mmap_miss); + nr_pages, &rss, &mmap_miss, + can_map_large); folio_unlock(folio); } while ((folio = next_uptodate_folio(&xas, mapping, end_pgoff)) != NULL); diff --git a/mm/memory.c b/mm/memory.c index 74b45e258323..b59ae7ce42eb 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -65,6 +65,7 @@ #include <linux/gfp.h> #include <linux/migrate.h> #include <linux/string.h> +#include <linux/shmem_fs.h> #include <linux/memory-tiers.h> #include <linux/debugfs.h> #include <linux/userfaultfd_k.h> @@ -5501,8 +5502,25 @@ vm_fault_t finish_fault(struct vm_fault *vmf) return ret; } + if (!needs_fallback && vma->vm_file) { + struct address_space *mapping = vma->vm_file->f_mapping; + pgoff_t file_end; + + file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); + + /* + * Do not allow to map with PTEs beyond i_size and with PMD + * across i_size to preserve SIGBUS semantics. + * + * Make an exception for shmem/tmpfs that for long time + * intentionally mapped with PMDs across i_size. + */ + needs_fallback = !shmem_mapping(mapping) && + file_end < folio_next_index(folio); + } + if (pmd_none(*vmf->pmd)) { - if (folio_test_pmd_mappable(folio)) { + if (!needs_fallback && folio_test_pmd_mappable(folio)) { ret = do_set_pmd(vmf, folio, page); if (ret != VM_FAULT_FALLBACK) return ret;

2 weeks, 5 days

2
2
0 0

[PATCH 19/26] drm/amd/display: Add cursor offload abort to the new HWSS path

by Alex Hung

From: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com> [HOW] If cursor attributes or position are passed into DC via a stream update and we take the newer HWSS paths then it's possible that the update races with cursor offloading if it's enabled. This can cause the cursor to remain on the screen if no further updates come in if it results in HW cursor support being disabled. [HOW] Add the abort into the HWSS path so that cursor offloading doesn't attempt to reprogram the cursor with outdated params. Cc: Mario Limonciello <mario.limonciello(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Reviewed-by: Dillon Varone <dillon.varone(a)amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com> Signed-off-by: Alex Hung <alex.hung(a)amd.com> --- .../drm/amd/display/dc/core/dc_hw_sequencer.c | 24 +++++++++++++++++++ .../amd/display/dc/hwss/dcn401/dcn401_hwseq.c | 2 ++ .../drm/amd/display/dc/hwss/hw_sequencer.h | 13 ++++++++++ 3 files changed, 39 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c index a01cb2897aab..e2763b60482a 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c @@ -1293,6 +1293,9 @@ void hwss_execute_sequence(struct dc *dc, case HUBP_MEM_PROGRAM_VIEWPORT: hwss_hubp_mem_program_viewport(params); break; + case ABORT_CURSOR_OFFLOAD_UPDATE: + hwss_abort_cursor_offload_update(params); + break; case SET_CURSOR_ATTRIBUTE: hwss_set_cursor_attribute(params); break; @@ -3076,6 +3079,15 @@ void hwss_hubp_mem_program_viewport(union block_sequence_params *params) hubp->funcs->mem_program_viewport(hubp, viewport, viewport_c); } +void hwss_abort_cursor_offload_update(union block_sequence_params *params) +{ + struct dc *dc = params->abort_cursor_offload_update_params.dc; + struct pipe_ctx *pipe_ctx = params->abort_cursor_offload_update_params.pipe_ctx; + + if (dc && dc->hwss.abort_cursor_offload_update) + dc->hwss.abort_cursor_offload_update(dc, pipe_ctx); +} + void hwss_set_cursor_attribute(union block_sequence_params *params) { struct dc *dc = params->set_cursor_attribute_params.dc; @@ -3934,6 +3946,18 @@ void hwss_add_hubp_mem_program_viewport(struct block_sequence_state *seq_state, } } +void hwss_add_abort_cursor_offload_update(struct block_sequence_state *seq_state, + struct dc *dc, + struct pipe_ctx *pipe_ctx) +{ + if (*seq_state->num_steps < MAX_HWSS_BLOCK_SEQUENCE_SIZE) { + seq_state->steps[*seq_state->num_steps].func = ABORT_CURSOR_OFFLOAD_UPDATE; + seq_state->steps[*seq_state->num_steps].params.abort_cursor_offload_update_params.dc = dc; + seq_state->steps[*seq_state->num_steps].params.abort_cursor_offload_update_params.pipe_ctx = pipe_ctx; + (*seq_state->num_steps)++; + } +} + void hwss_add_set_cursor_attribute(struct block_sequence_state *seq_state, struct dc *dc, struct pipe_ctx *pipe_ctx) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c index f02edc9371b0..01b0f72b6623 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c @@ -3675,6 +3675,8 @@ void dcn401_update_dchubp_dpp_sequence(struct dc *dc, pipe_ctx->update_flags.bits.scaler || viewport_changed == true) && pipe_ctx->stream->cursor_attributes.address.quad_part != 0) { + hwss_add_abort_cursor_offload_update(seq_state, dc, pipe_ctx); + hwss_add_set_cursor_attribute(seq_state, dc, pipe_ctx); /* Step 15: Cursor position setup */ diff --git a/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h b/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h index 3772b4aa11cc..8ed9eea40c56 100644 --- a/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h +++ b/drivers/gpu/drm/amd/display/dc/hwss/hw_sequencer.h @@ -696,6 +696,11 @@ struct hubp_program_mcache_id_and_split_coordinate_params { struct mcache_regs_struct *mcache_regs; }; +struct abort_cursor_offload_update_params { + struct dc *dc; + struct pipe_ctx *pipe_ctx; +}; + struct set_cursor_attribute_params { struct dc *dc; struct pipe_ctx *pipe_ctx; @@ -842,6 +847,7 @@ union block_sequence_params { struct mpc_insert_plane_params mpc_insert_plane_params; struct dpp_set_scaler_params dpp_set_scaler_params; struct hubp_mem_program_viewport_params hubp_mem_program_viewport_params; + struct abort_cursor_offload_update_params abort_cursor_offload_update_params; struct set_cursor_attribute_params set_cursor_attribute_params; struct set_cursor_position_params set_cursor_position_params; struct set_cursor_sdr_white_level_params set_cursor_sdr_white_level_params; @@ -960,6 +966,7 @@ enum block_sequence_func { MPC_INSERT_PLANE, DPP_SET_SCALER, HUBP_MEM_PROGRAM_VIEWPORT, + ABORT_CURSOR_OFFLOAD_UPDATE, SET_CURSOR_ATTRIBUTE, SET_CURSOR_POSITION, SET_CURSOR_SDR_WHITE_LEVEL, @@ -1565,6 +1572,8 @@ void hwss_dpp_set_scaler(union block_sequence_params *params); void hwss_hubp_mem_program_viewport(union block_sequence_params *params); +void hwss_abort_cursor_offload_update(union block_sequence_params *params); + void hwss_set_cursor_attribute(union block_sequence_params *params); void hwss_set_cursor_position(union block_sequence_params *params); @@ -1961,6 +1970,10 @@ void hwss_add_hubp_mem_program_viewport(struct block_sequence_state *seq_state, const struct rect *viewport, const struct rect *viewport_c); +void hwss_add_abort_cursor_offload_update(struct block_sequence_state *seq_state, + struct dc *dc, + struct pipe_ctx *pipe_ctx); + void hwss_add_set_cursor_attribute(struct block_sequence_state *seq_state, struct dc *dc, struct pipe_ctx *pipe_ctx); -- 2.43.0

2 weeks, 5 days

1
0
0 0

[PATCH 15/26] drm/amd/display: Increase EDID read retries

by Alex Hung

From: "Mario Limonciello (AMD)" <superm1(a)kernel.org> [WHY] When monitor is still booting EDID read can fail while DPCD read is successful. In this case no EDID data will be returned, and this could happen for a while. [HOW] Increase number of attempts to read EDID in dm_helpers_read_local_edid() to 25. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672 Cc: Mario Limonciello <mario.limonciello(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Reviewed-by: Alex Hung <alex.hung(a)amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1(a)kernel.org> Signed-off-by: Alex Hung <alex.hung(a)amd.com> --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c index 582a1c04f035..e5e993d3ef74 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c @@ -1006,8 +1006,8 @@ enum dc_edid_status dm_helpers_read_local_edid( struct amdgpu_dm_connector *aconnector = link->priv; struct drm_connector *connector = &aconnector->base; struct i2c_adapter *ddc; - int retry = 3; - enum dc_edid_status edid_status; + int retry = 25; + enum dc_edid_status edid_status = EDID_NO_RESPONSE; const struct drm_edid *drm_edid; const struct edid *edid; @@ -1037,7 +1037,7 @@ enum dc_edid_status dm_helpers_read_local_edid( } if (!drm_edid) - return EDID_NO_RESPONSE; + continue; edid = drm_edid_raw(drm_edid); // FIXME: Get rid of drm_edid_raw() if (!edid || @@ -1055,7 +1055,7 @@ enum dc_edid_status dm_helpers_read_local_edid( &sink->dc_edid, &sink->edid_caps); - } while (edid_status == EDID_BAD_CHECKSUM && --retry > 0); + } while ((edid_status == EDID_BAD_CHECKSUM || edid_status == EDID_NO_RESPONSE) && --retry > 0); if (edid_status != EDID_OK) DRM_ERROR("EDID err: %d, on connector: %s", -- 2.43.0

2 weeks, 5 days

1
0
0 0

[PATCH 10/26] drm/amd/display: Don't change brightness for disabled connectors

by Alex Hung

From: "Mario Limonciello (AMD)" <superm1(a)kernel.org> [WHY] When a laptop lid is closed the connector is disabled but userspace can still try to change brightness. This doesn't work because the panel is turned off. It will eventually time out, but there is a lot of stutter along the way. [How] Iterate all connectors to check whether the matching one for the backlight index is enabled. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4675 Cc: Mario Limonciello <mario.limonciello(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Reviewed-by: Ray Wu <ray.wu(a)amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1(a)kernel.org> Signed-off-by: Alex Hung <alex.hung(a)amd.com> --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 8a0555365719..424020c0756d 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -5108,6 +5108,21 @@ static void amdgpu_dm_backlight_set_level(struct amdgpu_display_manager *dm, struct dc_link *link; u32 brightness; bool rc, reallow_idle = false; + struct drm_connector *connector; + + list_for_each_entry(connector, &dm->ddev->mode_config.connector_list, head) { + struct amdgpu_dm_connector *aconnector = to_amdgpu_dm_connector(connector); + + if (aconnector->bl_idx != bl_idx) + continue; + + /* if connector is off, save the brightness for next time it's on */ + if (!aconnector->base.encoder) { + dm->brightness[bl_idx] = user_brightness; + dm->actual_brightness[bl_idx] = 0; + return; + } + } amdgpu_dm_update_backlight_caps(dm, bl_idx); caps = &dm->backlight_caps[bl_idx]; -- 2.43.0

2 weeks, 5 days

1
0
0 0

[PATCH 01/26] drm/amd/display: Check NULL before accessing

by Alex Hung

[WHAT] IGT kms_cursor_legacy's long-nonblocking-modeset-vs-cursor-atomic fails with NULL pointer dereference. This can be reproduced with both an eDP panel and a DP monitors connected. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 13 UID: 0 PID: 2960 Comm: kms_cursor_lega Not tainted 6.16.0-99-custom #8 PREEMPT(voluntary) Hardware name: AMD ........ RIP: 0010:dc_stream_get_scanoutpos+0x34/0x130 [amdgpu] Code: 57 4d 89 c7 41 56 49 89 ce 41 55 49 89 d5 41 54 49 89 fc 53 48 83 ec 18 48 8b 87 a0 64 00 00 48 89 75 d0 48 c7 c6 e0 41 30 c2 <48> 8b 38 48 8b 9f 68 06 00 00 e8 8d d7 fd ff 31 c0 48 81 c3 e0 02 RSP: 0018:ffffd0f3c2bd7608 EFLAGS: 00010292 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffd0f3c2bd7668 RDX: ffffd0f3c2bd7664 RSI: ffffffffc23041e0 RDI: ffff8b32494b8000 RBP: ffffd0f3c2bd7648 R08: ffffd0f3c2bd766c R09: ffffd0f3c2bd7760 R10: ffffd0f3c2bd7820 R11: 0000000000000000 R12: ffff8b32494b8000 R13: ffffd0f3c2bd7664 R14: ffffd0f3c2bd7668 R15: ffffd0f3c2bd766c FS: 000071f631b68700(0000) GS:ffff8b399f114000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001b8105000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> dm_crtc_get_scanoutpos+0xd7/0x180 [amdgpu] amdgpu_display_get_crtc_scanoutpos+0x86/0x1c0 [amdgpu] ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10[amdgpu] amdgpu_crtc_get_scanout_position+0x27/0x50 [amdgpu] drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xf7/0x400 drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x30 drm_crtc_get_last_vbltimestamp+0x55/0x90 drm_crtc_next_vblank_start+0x45/0xa0 drm_atomic_helper_wait_for_fences+0x81/0x1f0 ... Cc: Mario Limonciello <mario.limonciello(a)amd.com> Cc: Alex Deucher <alexander.deucher(a)amd.com> Cc: stable(a)vger.kernel.org Reviewed-by: Aurabindo Pillai <aurabindo.pillai(a)amd.com> Signed-off-by: Alex Hung <alex.hung(a)amd.com> --- drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c index 6d309c320253..129cd5f84983 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c @@ -737,9 +737,14 @@ bool dc_stream_get_scanoutpos(const struct dc_stream_state *stream, { uint8_t i; bool ret = false; - struct dc *dc = stream->ctx->dc; - struct resource_context *res_ctx = - &dc->current_state->res_ctx; + struct dc *dc; + struct resource_context *res_ctx; + + if (!stream->ctx) + return false; + + dc = stream->ctx->dc; + res_ctx = &dc->current_state->res_ctx; dc_exit_ips_for_hw_access(dc); -- 2.43.0

2 weeks, 5 days

1
0
0 0

FAILED: patch "[PATCH] mm/huge_memory: do not change split_huge_page*() target order" failed to apply to 6.17-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.17-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y git checkout FETCH_HEAD git cherry-pick -x 77008e1b2ef73249bceb078a321a3ff6bc087afb # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112014-photo-email-c834@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 77008e1b2ef73249bceb078a321a3ff6bc087afb Mon Sep 17 00:00:00 2001 From: Zi Yan <ziy(a)nvidia.com> Date: Thu, 16 Oct 2025 21:36:30 -0400 Subject: [PATCH] mm/huge_memory: do not change split_huge_page*() target order silently Page cache folios from a file system that support large block size (LBS) can have minimal folio order greater than 0, thus a high order folio might not be able to be split down to order-0. Commit e220917fa507 ("mm: split a folio in minimum folio order chunks") bumps the target order of split_huge_page*() to the minimum allowed order when splitting a LBS folio. This causes confusion for some split_huge_page*() callers like memory failure handling code, since they expect after-split folios all have order-0 when split succeeds but in reality get min_order_for_split() order folios and give warnings. Fix it by failing a split if the folio cannot be split to the target order. Rename try_folio_split() to try_folio_split_to_order() to reflect the added new_order parameter. Remove its unused list parameter. [The test poisons LBS folios, which cannot be split to order-0 folios, and also tries to poison all memory. The non split LBS folios take more memory than the test anticipated, leading to OOM. The patch fixed the kernel warning and the test needs some change to avoid OOM.] Link: https://lkml.kernel.org/r/20251017013630.139907-1-ziy@nvidia.com Fixes: e220917fa507 ("mm: split a folio in minimum folio order chunks") Signed-off-by: Zi Yan <ziy(a)nvidia.com> Reported-by: syzbot+e6367ea2fdab6ed46056(a)syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d2c943.a70a0220.1b52b.02b3.GAE@google.com/ Reviewed-by: Luis Chamberlain <mcgrof(a)kernel.org> Reviewed-by: Pankaj Raghav <p.raghav(a)samsung.com> Reviewed-by: Wei Yang <richard.weiyang(a)gmail.com> Acked-by: David Hildenbrand <david(a)redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com> Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Barry Song <baohua(a)kernel.org> Cc: David Hildenbrand <david(a)redhat.com> Cc: Dev Jain <dev.jain(a)arm.com> Cc: Jane Chu <jane.chu(a)oracle.com> Cc: Lance Yang <lance.yang(a)linux.dev> Cc: Liam Howlett <liam.howlett(a)oracle.com> Cc: Mariano Pache <npache(a)redhat.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com> Cc: Ryan Roberts <ryan.roberts(a)arm.com> Cc: Christian Brauner <brauner(a)kernel.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index f327d62fc985..71ac78b9f834 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -376,45 +376,30 @@ bool non_uniform_split_supported(struct folio *folio, unsigned int new_order, int folio_split(struct folio *folio, unsigned int new_order, struct page *page, struct list_head *list); /* - * try_folio_split - try to split a @folio at @page using non uniform split. + * try_folio_split_to_order - try to split a @folio at @page to @new_order using + * non uniform split. * @folio: folio to be split - * @page: split to order-0 at the given page - * @list: store the after-split folios + * @page: split to @new_order at the given page + * @new_order: the target split order * - * Try to split a @folio at @page using non uniform split to order-0, if - * non uniform split is not supported, fall back to uniform split. + * Try to split a @folio at @page using non uniform split to @new_order, if + * non uniform split is not supported, fall back to uniform split. After-split + * folios are put back to LRU list. Use min_order_for_split() to get the lower + * bound of @new_order. * * Return: 0: split is successful, otherwise split failed. */ -static inline int try_folio_split(struct folio *folio, struct page *page, - struct list_head *list) +static inline int try_folio_split_to_order(struct folio *folio, + struct page *page, unsigned int new_order) { - int ret = min_order_for_split(folio); - - if (ret < 0) - return ret; - - if (!non_uniform_split_supported(folio, 0, false)) - return split_huge_page_to_list_to_order(&folio->page, list, - ret); - return folio_split(folio, ret, page, list); + if (!non_uniform_split_supported(folio, new_order, /* warns= */ false)) + return split_huge_page_to_list_to_order(&folio->page, NULL, + new_order); + return folio_split(folio, new_order, page, NULL); } static inline int split_huge_page(struct page *page) { - struct folio *folio = page_folio(page); - int ret = min_order_for_split(folio); - - if (ret < 0) - return ret; - - /* - * split_huge_page() locks the page before splitting and - * expects the same page that has been split to be locked when - * returned. split_folio(page_folio(page)) cannot be used here - * because it converts the page to folio and passes the head - * page to be split. - */ - return split_huge_page_to_list_to_order(page, NULL, ret); + return split_huge_page_to_list_to_order(page, NULL, 0); } void deferred_split_folio(struct folio *folio, bool partially_mapped); @@ -597,14 +582,20 @@ static inline int split_huge_page(struct page *page) return -EINVAL; } +static inline int min_order_for_split(struct folio *folio) +{ + VM_WARN_ON_ONCE_FOLIO(1, folio); + return -EINVAL; +} + static inline int split_folio_to_list(struct folio *folio, struct list_head *list) { VM_WARN_ON_ONCE_FOLIO(1, folio); return -EINVAL; } -static inline int try_folio_split(struct folio *folio, struct page *page, - struct list_head *list) +static inline int try_folio_split_to_order(struct folio *folio, + struct page *page, unsigned int new_order) { VM_WARN_ON_ONCE_FOLIO(1, folio); return -EINVAL; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 1d1b74950332..feac4aef7dfb 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3653,8 +3653,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order, min_order = mapping_min_folio_order(folio->mapping); if (new_order < min_order) { - VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u", - min_order); ret = -EINVAL; goto out; } @@ -3986,12 +3984,7 @@ int min_order_for_split(struct folio *folio) int split_folio_to_list(struct folio *folio, struct list_head *list) { - int ret = min_order_for_split(folio); - - if (ret < 0) - return ret; - - return split_huge_page_to_list_to_order(&folio->page, list, ret); + return split_huge_page_to_list_to_order(&folio->page, list, 0); } /* diff --git a/mm/truncate.c b/mm/truncate.c index 91eb92a5ce4f..9210cf808f5c 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -194,6 +194,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end) size_t size = folio_size(folio); unsigned int offset, length; struct page *split_at, *split_at2; + unsigned int min_order; if (pos < start) offset = start - pos; @@ -223,8 +224,9 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end) if (!folio_test_large(folio)) return true; + min_order = mapping_min_folio_order(folio->mapping); split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE); - if (!try_folio_split(folio, split_at, NULL)) { + if (!try_folio_split_to_order(folio, split_at, min_order)) { /* * try to split at offset + length to make sure folios within * the range can be dropped, especially to avoid memory waste @@ -254,7 +256,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end) */ if (folio_test_large(folio2) && folio2->mapping == folio->mapping) - try_folio_split(folio2, split_at2, NULL); + try_folio_split_to_order(folio2, split_at2, min_order); folio_unlock(folio2); out:

2 weeks, 5 days

2
1
0 0

FAILED: patch "[PATCH] KVM: guest_memfd: Remove bindings on memslot deletion when" failed to apply to 6.12-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.12-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y git checkout FETCH_HEAD git cherry-pick -x ae431059e75d36170a5ae6b44cc4d06d43613215 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112009-getaway-overplay-a36a@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From ae431059e75d36170a5ae6b44cc4d06d43613215 Mon Sep 17 00:00:00 2001 From: Sean Christopherson <seanjc(a)google.com> Date: Mon, 3 Nov 2025 17:12:05 -0800 Subject: [PATCH] KVM: guest_memfd: Remove bindings on memslot deletion when gmem is dying When unbinding a memslot from a guest_memfd instance, remove the bindings even if the guest_memfd file is dying, i.e. even if its file refcount has gone to zero. If the memslot is freed before the file is fully released, nullifying the memslot side of the binding in kvm_gmem_release() will write to freed memory, as detected by syzbot+KASAN: ================================================================== BUG: KASAN: slab-use-after-free in kvm_gmem_release+0x176/0x440 virt/kvm/guest_memfd.c:353 Write of size 8 at addr ffff88807befa508 by task syz.0.17/6022 CPU: 0 UID: 0 PID: 6022 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 kvm_gmem_release+0x176/0x440 virt/kvm/guest_memfd.c:353 __fput+0x44c/0xa70 fs/file_table.c:468 task_work_run+0x1d4/0x260 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop+0xe9/0x130 kernel/entry/common.c:43 exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline] do_syscall_64+0x2bd/0xfa0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fbeeff8efc9 </TASK> Allocated by task 6023: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 poison_kmalloc_redzone mm/kasan/common.c:397 [inline] __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:414 kasan_kmalloc include/linux/kasan.h:262 [inline] __kmalloc_cache_noprof+0x3e2/0x700 mm/slub.c:5758 kmalloc_noprof include/linux/slab.h:957 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] kvm_set_memory_region+0x747/0xb90 virt/kvm/kvm_main.c:2104 kvm_vm_ioctl_set_memory_region+0x6f/0xd0 virt/kvm/kvm_main.c:2154 kvm_vm_ioctl+0x957/0xc60 virt/kvm/kvm_main.c:5201 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 6023: kasan_save_stack mm/kasan/common.c:56 [inline] kasan_save_track+0x3e/0x80 mm/kasan/common.c:77 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584 poison_slab_object mm/kasan/common.c:252 [inline] __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284 kasan_slab_free include/linux/kasan.h:234 [inline] slab_free_hook mm/slub.c:2533 [inline] slab_free mm/slub.c:6622 [inline] kfree+0x19a/0x6d0 mm/slub.c:6829 kvm_set_memory_region+0x9c4/0xb90 virt/kvm/kvm_main.c:2130 kvm_vm_ioctl_set_memory_region+0x6f/0xd0 virt/kvm/kvm_main.c:2154 kvm_vm_ioctl+0x957/0xc60 virt/kvm/kvm_main.c:5201 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Deliberately don't acquire filemap invalid lock when the file is dying as the lifecycle of f_mapping is outside the purview of KVM. Dereferencing the mapping is *probably* fine, but there's no need to invalidate anything as memslot deletion is responsible for zapping SPTEs, and the only code that can access the dying file is kvm_gmem_release(), whose core code is mutually exclusive with unbinding. Note, the mutual exclusivity is also what makes it safe to access the bindings on a dying gmem instance. Unbinding either runs with slots_lock held, or after the last reference to the owning "struct kvm" is put, and kvm_gmem_release() nullifies the slot pointer under slots_lock, and puts its reference to the VM after that is done. Reported-by: syzbot+2479e53d0db9b32ae2aa(a)syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68fa7a22.a70a0220.3bf6c6.008b.GAE@google.com Tested-by: syzbot+2479e53d0db9b32ae2aa(a)syzkaller.appspotmail.com Fixes: a7800aa80ea4 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory") Cc: stable(a)vger.kernel.org Cc: Hillf Danton <hdanton(a)sina.com> Reviewed-By: Vishal Annapurve <vannapurve(a)google.com> Link: https://patch.msgid.link/20251104011205.3853541-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc(a)google.com> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index fbca8c0972da..ffadc5ee8e04 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -623,24 +623,11 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, return r; } -void kvm_gmem_unbind(struct kvm_memory_slot *slot) +static void __kvm_gmem_unbind(struct kvm_memory_slot *slot, struct kvm_gmem *gmem) { unsigned long start = slot->gmem.pgoff; unsigned long end = start + slot->npages; - struct kvm_gmem *gmem; - struct file *file; - /* - * Nothing to do if the underlying file was already closed (or is being - * closed right now), kvm_gmem_release() invalidates all bindings. - */ - file = kvm_gmem_get_file(slot); - if (!file) - return; - - gmem = file->private_data; - - filemap_invalidate_lock(file->f_mapping); xa_store_range(&gmem->bindings, start, end - 1, NULL, GFP_KERNEL); /* @@ -648,6 +635,38 @@ void kvm_gmem_unbind(struct kvm_memory_slot *slot) * cannot see this memslot. */ WRITE_ONCE(slot->gmem.file, NULL); +} + +void kvm_gmem_unbind(struct kvm_memory_slot *slot) +{ + struct file *file; + + /* + * Nothing to do if the underlying file was _already_ closed, as + * kvm_gmem_release() invalidates and nullifies all bindings. + */ + if (!slot->gmem.file) + return; + + file = kvm_gmem_get_file(slot); + + /* + * However, if the file is _being_ closed, then the bindings need to be + * removed as kvm_gmem_release() might not run until after the memslot + * is freed. Note, modifying the bindings is safe even though the file + * is dying as kvm_gmem_release() nullifies slot->gmem.file under + * slots_lock, and only puts its reference to KVM after destroying all + * bindings. I.e. reaching this point means kvm_gmem_release() hasn't + * yet destroyed the bindings or freed the gmem_file, and can't do so + * until the caller drops slots_lock. + */ + if (!file) { + __kvm_gmem_unbind(slot, slot->gmem.file->private_data); + return; + } + + filemap_invalidate_lock(file->f_mapping); + __kvm_gmem_unbind(slot, file->private_data); filemap_invalidate_unlock(file->f_mapping); fput(file);

2 weeks, 5 days

2
3
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror