May 2024 - Linux-stable-mirror

[PATCH 2/3] KVM: arm64: Allow AArch32 PSTATE.M to be restored as System mode

by Marc Zyngier

It appears that we don't allowed a vcpu to be restored in AArch32 System mode, as we *never* included it in the list of valid modes. Just add it to the list of allowed modes. Fixes: 0d854a60b1d7 ("arm64: KVM: enable initialization of a 32bit vcpu") Signed-off-by: Marc Zyngier <maz(a)kernel.org> Cc: stable(a)vger.kernel.org --- arch/arm64/kvm/guest.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index d9617b11f7a8..11098eb7eb44 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -251,6 +251,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) case PSR_AA32_MODE_SVC: case PSR_AA32_MODE_ABT: case PSR_AA32_MODE_UND: + case PSR_AA32_MODE_SYS: if (!vcpu_el1_is_32bit(vcpu)) return -EINVAL; break; -- 2.39.2

1 year, 6 months

2
1
0 0

[PATCH 1/3] KVM: arm64: Fix AArch32 register narrowing on userspace write

by Marc Zyngier

When userspace writes to once of the core registers, we make sure to narrow the corresponding GPRs if PSTATE indicates an AArch32 context. The code tries to check whether the context is EL0 or EL1 so that it narrows the correct registers. But it does so by checking the full PSTATE instead of PSTATE.M. As a consequence, and if we are restoring an AArch32 EL0 context in a 64bit guest, and that PSTATE has *any* bit set outside of PSTATE.M, we narrow *all* registers instead of only the first 15, destroying the 64bit state. Obviously, this is not something the guest is likely to enjoy. Correctly masking PSTATE to only evaluate PSTATE.M fixes it. Fixes: 90c1f934ed71 ("KVM: arm64: Get rid of the AArch32 register mapping code") Reported-by: Nina Schoetterl-Glausch <nsg(a)linux.ibm.com> Signed-off-by: Marc Zyngier <maz(a)kernel.org> Cc: stable(a)vger.kernel.org --- arch/arm64/kvm/guest.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index e2f762d959bb..d9617b11f7a8 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -276,7 +276,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg) if (*vcpu_cpsr(vcpu) & PSR_MODE32_BIT) { int i, nr_reg; - switch (*vcpu_cpsr(vcpu)) { + switch (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) { /* * Either we are dealing with user mode, and only the * first 15 registers (+ PC) must be narrowed to 32bit. -- 2.39.2

1 year, 6 months

3
2
0 0

[merged mm-hotfixes-stable] mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm/memory-failure: fix handling of dissolved but not taken off from buddy pages has been removed from the -mm tree. Its filename was mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Miaohe Lin <linmiaohe(a)huawei.com> Subject: mm/memory-failure: fix handling of dissolved but not taken off from buddy pages Date: Thu, 23 May 2024 15:12:17 +0800 When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com> Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/memory-failure.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/memory-failure.c~mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages +++ a/mm/memory-failure.c @@ -1221,7 +1221,7 @@ static int me_huge_page(struct page_stat * subpages. */ folio_put(folio); - if (__page_handle_poison(p) >= 0) { + if (__page_handle_poison(p) > 0) { page_ref_inc(p); res = MF_RECOVERED; } else { @@ -2091,7 +2091,7 @@ retry: */ if (res == 0) { folio_unlock(folio); - if (__page_handle_poison(p) >= 0) { + if (__page_handle_poison(p) > 0) { page_ref_inc(p); res = MF_RECOVERED; } else { _ Patches currently in -mm which might be from linmiaohe(a)huawei.com are

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] mm-proc-pid-smaps_rollup-avoid-skipping-vma-after-getting-mmap_lock-again.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again has been removed from the -mm tree. Its filename was mm-proc-pid-smaps_rollup-avoid-skipping-vma-after-getting-mmap_lock-again.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Yuanyuan Zhong <yzhong(a)purestorage.com> Subject: mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again Date: Thu, 23 May 2024 12:35:31 -0600 After switching smaps_rollup to use VMA iterator, searching for next entry is part of the condition expression of the do-while loop. So the current VMA needs to be addressed before the continue statement. Otherwise, with some VMAs skipped, userspace observed memory consumption from /proc/pid/smaps_rollup will be smaller than the sum of the corresponding fields from /proc/pid/smaps. Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end") Signed-off-by: Yuanyuan Zhong <yzhong(a)purestorage.com> Reviewed-by: Mohamed Khalfella <mkhalfella(a)purestorage.com> Cc: David Hildenbrand <david(a)redhat.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/proc/task_mmu.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/fs/proc/task_mmu.c~mm-proc-pid-smaps_rollup-avoid-skipping-vma-after-getting-mmap_lock-again +++ a/fs/proc/task_mmu.c @@ -970,12 +970,17 @@ static int show_smaps_rollup(struct seq_ break; /* Case 1 and 2 above */ - if (vma->vm_start >= last_vma_end) + if (vma->vm_start >= last_vma_end) { + smap_gather_stats(vma, &mss, 0); + last_vma_end = vma->vm_end; continue; + } /* Case 4 above */ - if (vma->vm_end > last_vma_end) + if (vma->vm_end > last_vma_end) { smap_gather_stats(vma, &mss, last_vma_end); + last_vma_end = vma->vm_end; + } } } for_each_vma(vmi, vma); _ Patches currently in -mm which might be from yzhong(a)purestorage.com are

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] nilfs2-fix-potential-hang-in-nilfs_detach_log_writer.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: nilfs2: fix potential hang in nilfs_detach_log_writer() has been removed from the -mm tree. Its filename was nilfs2-fix-potential-hang-in-nilfs_detach_log_writer.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Subject: nilfs2: fix potential hang in nilfs_detach_log_writer() Date: Mon, 20 May 2024 22:26:21 +0900 Syzbot has reported a potential hang in nilfs_detach_log_writer() called during nilfs2 unmount. Analysis revealed that this is because nilfs_segctor_sync(), which synchronizes with the log writer thread, can be called after nilfs_segctor_destroy() terminates that thread, as shown in the call trace below: nilfs_detach_log_writer nilfs_segctor_destroy nilfs_segctor_kill_thread --> Shut down log writer thread flush_work nilfs_iput_work_func nilfs_dispose_list iput nilfs_evict_inode nilfs_transaction_commit nilfs_construct_segment (if inode needs sync) nilfs_segctor_sync --> Attempt to synchronize with log writer thread *** DEADLOCK *** Fix this issue by changing nilfs_segctor_sync() so that the log writer thread returns normally without synchronizing after it terminates, and by forcing tasks that are already waiting to complete once after the thread terminates. The skipped inode metadata flushout will then be processed together in the subsequent cleanup work in nilfs_segctor_destroy(). Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Reported-by: syzbot+e3973c409251e136fdd0(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0 Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Cc: "Bai, Shuangpeng" <sjb7183(a)psu.edu> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/nilfs2/segment.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) --- a/fs/nilfs2/segment.c~nilfs2-fix-potential-hang-in-nilfs_detach_log_writer +++ a/fs/nilfs2/segment.c @@ -2190,6 +2190,14 @@ static int nilfs_segctor_sync(struct nil for (;;) { set_current_state(TASK_INTERRUPTIBLE); + /* + * Synchronize only while the log writer thread is alive. + * Leave flushing out after the log writer thread exits to + * the cleanup work in nilfs_segctor_destroy(). + */ + if (!sci->sc_task) + break; + if (atomic_read(&wait_req.done)) { err = wait_req.err; break; @@ -2205,7 +2213,7 @@ static int nilfs_segctor_sync(struct nil return err; } -static void nilfs_segctor_wakeup(struct nilfs_sc_info *sci, int err) +static void nilfs_segctor_wakeup(struct nilfs_sc_info *sci, int err, bool force) { struct nilfs_segctor_wait_request *wrq, *n; unsigned long flags; @@ -2213,7 +2221,7 @@ static void nilfs_segctor_wakeup(struct spin_lock_irqsave(&sci->sc_wait_request.lock, flags); list_for_each_entry_safe(wrq, n, &sci->sc_wait_request.head, wq.entry) { if (!atomic_read(&wrq->done) && - nilfs_cnt32_ge(sci->sc_seq_done, wrq->seq)) { + (force || nilfs_cnt32_ge(sci->sc_seq_done, wrq->seq))) { wrq->err = err; atomic_set(&wrq->done, 1); } @@ -2362,7 +2370,7 @@ static void nilfs_segctor_notify(struct if (mode == SC_LSEG_SR) { sci->sc_state &= ~NILFS_SEGCTOR_COMMIT; sci->sc_seq_done = sci->sc_seq_accepted; - nilfs_segctor_wakeup(sci, err); + nilfs_segctor_wakeup(sci, err, false); sci->sc_flush_request = 0; } else { if (mode == SC_FLUSH_FILE) @@ -2746,6 +2754,13 @@ static void nilfs_segctor_destroy(struct || sci->sc_seq_request != sci->sc_seq_done); spin_unlock(&sci->sc_state_lock); + /* + * Forcibly wake up tasks waiting in nilfs_segctor_sync(), which can + * be called from delayed iput() via nilfs_evict_inode() and can race + * with the above log writer thread termination. + */ + nilfs_segctor_wakeup(sci, 0, true); + if (flush_work(&sci->sc_iput_work)) flag = true; _ Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] nilfs2-fix-unexpected-freezing-of-nilfs_segctor_sync.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: nilfs2: fix unexpected freezing of nilfs_segctor_sync() has been removed from the -mm tree. Its filename was nilfs2-fix-unexpected-freezing-of-nilfs_segctor_sync.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Subject: nilfs2: fix unexpected freezing of nilfs_segctor_sync() Date: Mon, 20 May 2024 22:26:20 +0900 A potential and reproducible race issue has been identified where nilfs_segctor_sync() would block even after the log writer thread writes a checkpoint, unless there is an interrupt or other trigger to resume log writing. This turned out to be because, depending on the execution timing of the log writer thread running in parallel, the log writer thread may skip responding to nilfs_segctor_sync(), which causes a call to schedule() waiting for completion within nilfs_segctor_sync() to lose the opportunity to wake up. The reason why waking up the task waiting in nilfs_segctor_sync() may be skipped is that updating the request generation issued using a shared sequence counter and adding an wait queue entry to the request wait queue to the log writer, are not done atomically. There is a possibility that log writing and request completion notification by nilfs_segctor_wakeup() may occur between the two operations, and in that case, the wait queue entry is not yet visible to nilfs_segctor_wakeup() and the wake-up of nilfs_segctor_sync() will be carried over until the next request occurs. Fix this issue by performing these two operations simultaneously within the lock section of sc_state_lock. Also, following the memory barrier guidelines for event waiting loops, move the call to set_current_state() in the same location into the event waiting loop to ensure that a memory barrier is inserted just before the event condition determination. Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com Fixes: 9ff05123e3bf ("nilfs2: segment constructor") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Cc: "Bai, Shuangpeng" <sjb7183(a)psu.edu> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/nilfs2/segment.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) --- a/fs/nilfs2/segment.c~nilfs2-fix-unexpected-freezing-of-nilfs_segctor_sync +++ a/fs/nilfs2/segment.c @@ -2168,19 +2168,28 @@ static int nilfs_segctor_sync(struct nil struct nilfs_segctor_wait_request wait_req; int err = 0; - spin_lock(&sci->sc_state_lock); init_wait(&wait_req.wq); wait_req.err = 0; atomic_set(&wait_req.done, 0); + init_waitqueue_entry(&wait_req.wq, current); + + /* + * To prevent a race issue where completion notifications from the + * log writer thread are missed, increment the request sequence count + * "sc_seq_request" and insert a wait queue entry using the current + * sequence number into the "sc_wait_request" queue at the same time + * within the lock section of "sc_state_lock". + */ + spin_lock(&sci->sc_state_lock); wait_req.seq = ++sci->sc_seq_request; + add_wait_queue(&sci->sc_wait_request, &wait_req.wq); spin_unlock(&sci->sc_state_lock); - init_waitqueue_entry(&wait_req.wq, current); - add_wait_queue(&sci->sc_wait_request, &wait_req.wq); - set_current_state(TASK_INTERRUPTIBLE); wake_up(&sci->sc_wait_daemon); for (;;) { + set_current_state(TASK_INTERRUPTIBLE); + if (atomic_read(&wait_req.done)) { err = wait_req.err; break; _ Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] nilfs2-fix-use-after-free-of-timer-for-log-writer-thread.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: nilfs2: fix use-after-free of timer for log writer thread has been removed from the -mm tree. Its filename was nilfs2-fix-use-after-free-of-timer-for-log-writer-thread.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Subject: nilfs2: fix use-after-free of timer for log writer thread Date: Mon, 20 May 2024 22:26:19 +0900 Patch series "nilfs2: fix log writer related issues". This bug fix series covers three nilfs2 log writer-related issues, including a timer use-after-free issue and potential deadlock issue on unmount, and a potential freeze issue in event synchronization found during their analysis. Details are described in each commit log. This patch (of 3): A use-after-free issue has been reported regarding the timer sc_timer on the nilfs_sc_info structure. The problem is that even though it is used to wake up a sleeping log writer thread, sc_timer is not shut down until the nilfs_sc_info structure is about to be freed, and is used regardless of the thread's lifetime. Fix this issue by limiting the use of sc_timer only while the log writer thread is alive. Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Reported-by: "Bai, Shuangpeng" <sjb7183(a)psu.edu> Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/nilfs2/segment.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) --- a/fs/nilfs2/segment.c~nilfs2-fix-use-after-free-of-timer-for-log-writer-thread +++ a/fs/nilfs2/segment.c @@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(st { spin_lock(&sci->sc_state_lock); if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) { - sci->sc_timer.expires = jiffies + sci->sc_interval; - add_timer(&sci->sc_timer); + if (sci->sc_task) { + sci->sc_timer.expires = jiffies + sci->sc_interval; + add_timer(&sci->sc_timer); + } sci->sc_state |= NILFS_SEGCTOR_COMMIT; } spin_unlock(&sci->sc_state_lock); @@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct */ static void nilfs_segctor_accept(struct nilfs_sc_info *sci) { + bool thread_is_alive; + spin_lock(&sci->sc_state_lock); sci->sc_seq_accepted = sci->sc_seq_request; + thread_is_alive = (bool)sci->sc_task; spin_unlock(&sci->sc_state_lock); - del_timer_sync(&sci->sc_timer); + + /* + * This function does not race with the log writer thread's + * termination. Therefore, deleting sc_timer, which should not be + * done after the log writer thread exits, can be done safely outside + * the area protected by sc_state_lock. + */ + if (thread_is_alive) + del_timer_sync(&sci->sc_timer); } /** @@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct sci->sc_flush_request &= ~FLUSH_DAT_BIT; /* re-enable timer if checkpoint creation was not done */ - if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && + if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task && time_before(jiffies, sci->sc_timer.expires)) add_timer(&sci->sc_timer); } @@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *ar int timeout = 0; sci->sc_timer_task = current; + timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0); /* start sync. */ sci->sc_task = current; @@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *ar end_thread: /* end sync. */ sci->sc_task = NULL; + timer_shutdown_sync(&sci->sc_timer); wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */ spin_unlock(&sci->sc_state_lock); return 0; @@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segct INIT_LIST_HEAD(&sci->sc_gc_inodes); INIT_LIST_HEAD(&sci->sc_iput_queue); INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func); - timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0); sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT; sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ; @@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct down_write(&nilfs->ns_segctor_sem); - timer_shutdown_sync(&sci->sc_timer); kfree(sci); } _ Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] selftests-mm-fix-build-warnings-on-ppc64.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: selftests/mm: fix build warnings on ppc64 has been removed from the -mm tree. Its filename was selftests-mm-fix-build-warnings-on-ppc64.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Michael Ellerman <mpe(a)ellerman.id.au> Subject: selftests/mm: fix build warnings on ppc64 Date: Tue, 21 May 2024 13:02:19 +1000 Fix warnings like: In file included from uffd-unit-tests.c:8: uffd-unit-tests.c: In function `uffd_poison_handle_fault': uffd-common.h:45:33: warning: format `%llu' expects argument of type `long long unsigned int', but argument 3 has type `__u64' {aka `long unsigned int'} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.au Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> Cc: Shuah Khan <skhan(a)linuxfoundation.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- tools/testing/selftests/mm/gup_test.c | 1 + tools/testing/selftests/mm/uffd-common.h | 1 + 2 files changed, 2 insertions(+) --- a/tools/testing/selftests/mm/gup_test.c~selftests-mm-fix-build-warnings-on-ppc64 +++ a/tools/testing/selftests/mm/gup_test.c @@ -1,3 +1,4 @@ +#define __SANE_USERSPACE_TYPES__ // Use ll64 #include <fcntl.h> #include <errno.h> #include <stdio.h> --- a/tools/testing/selftests/mm/uffd-common.h~selftests-mm-fix-build-warnings-on-ppc64 +++ a/tools/testing/selftests/mm/uffd-common.h @@ -8,6 +8,7 @@ #define __UFFD_COMMON_H__ #define _GNU_SOURCE +#define __SANE_USERSPACE_TYPES__ // Use ll64 #include <stdio.h> #include <errno.h> #include <unistd.h> _ Patches currently in -mm which might be from mpe(a)ellerman.id.au are

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] selftests-mm-compaction_test-fix-bogus-test-success-and-reduce-probability-of-oom-killer-invocation.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation has been removed from the -mm tree. Its filename was selftests-mm-compaction_test-fix-bogus-test-success-and-reduce-probability-of-oom-killer-invocation.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Dev Jain <dev.jain(a)arm.com> Subject: selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation Date: Tue, 21 May 2024 13:13:58 +0530 Reset nr_hugepages to zero before the start of the test. If a non-zero number of hugepages is already set before the start of the test, the following problems arise: - The probability of the test getting OOM-killed increases. Proof: The test wants to run on 80% of available memory to prevent OOM-killing (see original code comments). Let the value of mem_free at the start of the test, when nr_hugepages = 0, be x. In the other case, when nr_hugepages > 0, let the memory consumed by hugepages be y. In the former case, the test operates on 0.8 * x of memory. In the latter, the test operates on 0.8 * (x - y) of memory, with y already filled, hence, memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 * x. Q.E.D - The probability of a bogus test success increases. Proof: Let the memory consumed by hugepages be greater than 25% of x, with x and y defined as above. The definition of compaction_index is c_index = (x - y)/z where z is the memory consumed by hugepages after trying to increase them again. In check_compaction(), we set the number of hugepages to zero, and then increase them back; the probability that they will be set back to consume at least y amount of memory again is very high (since there is not much delay between the two attempts of changing nr_hugepages). Hence, z >= y > (x/4) (by the 25% assumption). Therefore, c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3 hence, c_index can always be forced to be less than 3, thereby the test succeeding always. Q.E.D Link: https://lkml.kernel.org/r/20240521074358.675031-4-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain(a)arm.com> Cc: <stable(a)vger.kernel.org> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com> Cc: Shuah Khan <shuah(a)kernel.org> Cc: Sri Jayaramappa <sjayaram(a)akamai.com> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- tools/testing/selftests/mm/compaction_test.c | 71 +++++++++++------ 1 file changed, 49 insertions(+), 22 deletions(-) --- a/tools/testing/selftests/mm/compaction_test.c~selftests-mm-compaction_test-fix-bogus-test-success-and-reduce-probability-of-oom-killer-invocation +++ a/tools/testing/selftests/mm/compaction_test.c @@ -82,13 +82,16 @@ int prereq(void) return -1; } -int check_compaction(unsigned long mem_free, unsigned long hugepage_size) +int check_compaction(unsigned long mem_free, unsigned long hugepage_size, + unsigned long initial_nr_hugepages) { unsigned long nr_hugepages_ul; int fd, ret = -1; int compaction_index = 0; - char initial_nr_hugepages[20] = {0}; char nr_hugepages[20] = {0}; + char init_nr_hugepages[20] = {0}; + + sprintf(init_nr_hugepages, "%lu", initial_nr_hugepages); /* We want to test with 80% of available memory. Else, OOM killer comes in to play */ @@ -102,23 +105,6 @@ int check_compaction(unsigned long mem_f goto out; } - if (read(fd, initial_nr_hugepages, sizeof(initial_nr_hugepages)) <= 0) { - ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); - goto close_fd; - } - - lseek(fd, 0, SEEK_SET); - - /* Start with the initial condition of 0 huge pages*/ - if (write(fd, "0", sizeof(char)) != sizeof(char)) { - ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n", - strerror(errno)); - goto close_fd; - } - - lseek(fd, 0, SEEK_SET); - /* Request a large number of huge pages. The Kernel will allocate as much as it can */ if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) { @@ -146,8 +132,8 @@ int check_compaction(unsigned long mem_f lseek(fd, 0, SEEK_SET); - if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages)) - != strlen(initial_nr_hugepages)) { + if (write(fd, init_nr_hugepages, strlen(init_nr_hugepages)) + != strlen(init_nr_hugepages)) { ksft_print_msg("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n", strerror(errno)); goto close_fd; @@ -171,6 +157,41 @@ int check_compaction(unsigned long mem_f return ret; } +int set_zero_hugepages(unsigned long *initial_nr_hugepages) +{ + int fd, ret = -1; + char nr_hugepages[20] = {0}; + + fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK); + if (fd < 0) { + ksft_print_msg("Failed to open /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); + goto out; + } + if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) { + ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); + goto close_fd; + } + + lseek(fd, 0, SEEK_SET); + + /* Start with the initial condition of 0 huge pages */ + if (write(fd, "0", sizeof(char)) != sizeof(char)) { + ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n", + strerror(errno)); + goto close_fd; + } + + *initial_nr_hugepages = strtoul(nr_hugepages, NULL, 10); + ret = 0; + + close_fd: + close(fd); + + out: + return ret; +} int main(int argc, char **argv) { @@ -181,6 +202,7 @@ int main(int argc, char **argv) unsigned long mem_free = 0; unsigned long hugepage_size = 0; long mem_fragmentable_MB = 0; + unsigned long initial_nr_hugepages; ksft_print_header(); @@ -189,6 +211,10 @@ int main(int argc, char **argv) ksft_set_plan(1); + /* Start the test without hugepages reducing mem_free */ + if (set_zero_hugepages(&initial_nr_hugepages)) + ksft_exit_fail(); + lim.rlim_cur = RLIM_INFINITY; lim.rlim_max = RLIM_INFINITY; if (setrlimit(RLIMIT_MEMLOCK, &lim)) @@ -232,7 +258,8 @@ int main(int argc, char **argv) entry = entry->next; } - if (check_compaction(mem_free, hugepage_size) == 0) + if (check_compaction(mem_free, hugepage_size, + initial_nr_hugepages) == 0) ksft_exit_pass(); ksft_exit_fail(); _ Patches currently in -mm which might be from dev.jain(a)arm.com are selftests-mm-va_high_addr_switch-reduce-test-noise.patch selftests-mm-va_high_addr_switch-dynamically-initialize-testcases-to-enable-lpa2-testing.patch

1 year, 6 months

1
0
0 0

[merged mm-hotfixes-stable] selftests-mm-compaction_test-fix-incorrect-write-of-zero-to-nr_hugepages.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages has been removed from the -mm tree. Its filename was selftests-mm-compaction_test-fix-incorrect-write-of-zero-to-nr_hugepages.patch This patch was dropped because it was merged into the mm-hotfixes-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Dev Jain <dev.jain(a)arm.com> Subject: selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages Date: Tue, 21 May 2024 13:13:57 +0530 Currently, the test tries to set nr_hugepages to zero, but that is not actually done because the file offset is not reset after read(). Fix that using lseek(). Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain(a)arm.com> Cc: <stable(a)vger.kernel.org> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com> Cc: Shuah Khan <shuah(a)kernel.org> Cc: Sri Jayaramappa <sjayaram(a)akamai.com> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- tools/testing/selftests/mm/compaction_test.c | 2 ++ 1 file changed, 2 insertions(+) --- a/tools/testing/selftests/mm/compaction_test.c~selftests-mm-compaction_test-fix-incorrect-write-of-zero-to-nr_hugepages +++ a/tools/testing/selftests/mm/compaction_test.c @@ -108,6 +108,8 @@ int check_compaction(unsigned long mem_f goto close_fd; } + lseek(fd, 0, SEEK_SET); + /* Start with the initial condition of 0 huge pages*/ if (write(fd, "0", sizeof(char)) != sizeof(char)) { ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n", _ Patches currently in -mm which might be from dev.jain(a)arm.com are selftests-mm-va_high_addr_switch-reduce-test-noise.patch selftests-mm-va_high_addr_switch-dynamically-initialize-testcases-to-enable-lpa2-testing.patch

1 year, 6 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror May 2024