This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.27-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.1.27-rc1
Alexandre Ghiti alexghiti@rivosinc.com riscv: No need to relocate the dtb as it lies in the fixmap region
Alexandre Ghiti alexghiti@rivosinc.com riscv: Do not set initial_boot_params to the linear address of the dtb
Alexandre Ghiti alexghiti@rivosinc.com riscv: Move early dtb mapping into the fixmap region
Stephen Boyd swboyd@chromium.org driver core: Don't require dynamic_debug for initcall_debug probe timing
Arınç ÜNAL arinc.unal@arinc9.com USB: serial: option: add UNISOC vendor and TOZED LT70C product
Genjian Zhang zhanggenjian@kylinos.cn btrfs: fix uninitialized variable warnings
Ruihan Li lrh2000@pku.edu.cn bluetooth: Perform careful capability checks in hci_sock_ioctl()
Werner Sembach wse@tuxedocomputers.com gpiolib: acpi: Add a ignore wakeup quirk for Clevo NL5xNU
Daniel Vetter daniel.vetter@ffwll.ch drm/fb-helper: set x/yres_virtual in drm_fb_helper_check_var
Jisoo Jang jisoo.jang@yonsei.ac.kr wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies()
Paolo Abeni pabeni@redhat.com mptcp: fix accept vs worker race
Paolo Abeni pabeni@redhat.com mptcp: stops worker on unaccepted sockets at listener close
Liam R. Howlett Liam.Howlett@oracle.com mm/mempolicy: fix use-after-free of VMA iterator
David Matlack dmatlack@google.com KVM: arm64: Retry fault if vma_lookup() results become invalid
Florian Fainelli f.fainelli@gmail.com phy: phy-brcm-usb: Utilize platform_get_irq_byname_optional()
David Gow davidgow@google.com um: Only disable SSE on clang to work around old GCC bugs
-------------
Diffstat:
Documentation/riscv/vm-layout.rst | 4 +- Makefile | 4 +- arch/arm64/kvm/mmu.c | 47 ++++----- arch/riscv/include/asm/fixmap.h | 8 ++ arch/riscv/include/asm/pgtable.h | 8 +- arch/riscv/kernel/setup.c | 6 +- arch/riscv/mm/init.c | 82 +++++++-------- arch/x86/Makefile.um | 5 + drivers/base/dd.c | 7 +- drivers/gpio/gpiolib-acpi.c | 13 +++ drivers/gpu/drm/drm_fb_helper.c | 3 + .../broadcom/brcm80211/brcmfmac/cfg80211.c | 5 + drivers/phy/broadcom/phy-brcm-usb.c | 4 +- drivers/usb/serial/option.c | 6 ++ fs/btrfs/send.c | 2 +- fs/btrfs/volumes.c | 2 +- mm/mempolicy.c | 115 ++++++++++----------- net/bluetooth/hci_sock.c | 9 +- net/mptcp/protocol.c | 74 ++++++++----- net/mptcp/protocol.h | 2 + net/mptcp/subflow.c | 80 +++++++++++++- 21 files changed, 309 insertions(+), 177 deletions(-)
From: David Gow davidgow@google.com
commit a3046a618a284579d1189af8711765f553eed707 upstream.
As part of the Rust support for UML, we disable SSE (and similar flags) to match the normal x86 builds. This both makes sense (we ideally want a similar configuration to x86), and works around a crash bug with SSE generation under Rust with LLVM.
However, this breaks compiling stdlib.h under gcc < 11, as the x86_64 ABI requires floating-point return values be stored in an SSE register. gcc 11 fixes this by only doing register allocation when a function is actually used, and since we never use atof(), it shouldn't be a problem: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652
Nevertheless, only disable SSE on clang setups, as that's a simple way of working around everyone's bugs.
Fixes: 884981867947 ("rust: arch/um: Disable FP/SIMD instruction to match x86") Reported-by: Roberto Sassu roberto.sassu@huaweicloud.com Link: https://lore.kernel.org/linux-um/6df2ecef9011d85654a82acd607fdcbc93ad593c.ca... Tested-by: Roberto Sassu roberto.sassu@huaweicloud.com Tested-by: SeongJae Park sj@kernel.org Signed-off-by: David Gow davidgow@google.com Reviewed-by: Vincenzo Palazzo vincenzopalazzodev@gmail.com Tested-by: Arthur Grillo arthurgrillo@riseup.net Signed-off-by: Richard Weinberger richard@nod.at Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/Makefile.um | 5 +++++ 1 file changed, 5 insertions(+)
--- a/arch/x86/Makefile.um +++ b/arch/x86/Makefile.um @@ -3,9 +3,14 @@ core-y += arch/x86/crypto/
# # Disable SSE and other FP/SIMD instructions to match normal x86 +# This is required to work around issues in older LLVM versions, but breaks +# GCC versions < 11. See: +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652 # +ifeq ($(CONFIG_CC_IS_CLANG),y) KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx KBUILD_RUSTFLAGS += -Ctarget-feature=-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2 +endif
ifeq ($(CONFIG_X86_32),y) START := 0x8048000
From: Florian Fainelli f.fainelli@gmail.com
commit 53bffe0055741440a6c91abb80bad1c62ea443e3 upstream.
The wake-up interrupt lines are entirely optional, avoid printing messages that interrupts were not found by switching to the _optional variant.
Signed-off-by: Florian Fainelli f.fainelli@gmail.com Acked-by: Justin Chen justinpopo6@gmail.com Link: https://lore.kernel.org/r/20221026224450.2958762-1-f.fainelli@gmail.com Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/phy/broadcom/phy-brcm-usb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/phy/broadcom/phy-brcm-usb.c +++ b/drivers/phy/broadcom/phy-brcm-usb.c @@ -445,9 +445,9 @@ static int brcm_usb_phy_dvr_init(struct priv->suspend_clk = NULL; }
- priv->wake_irq = platform_get_irq_byname(pdev, "wake"); + priv->wake_irq = platform_get_irq_byname_optional(pdev, "wake"); if (priv->wake_irq < 0) - priv->wake_irq = platform_get_irq_byname(pdev, "wakeup"); + priv->wake_irq = platform_get_irq_byname_optional(pdev, "wakeup"); if (priv->wake_irq >= 0) { err = devm_request_irq(dev, priv->wake_irq, brcm_usb_phy_wake_isr, 0,
From: David Matlack dmatlack@google.com
commit 13ec9308a85702af7c31f3638a2720863848a7f2 upstream.
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can detect if the results of vma_lookup() (e.g. vma_shift) become stale before it acquires kvm->mmu_lock. This fixes a theoretical bug where a VMA could be changed by userspace after vma_lookup() and before KVM reads the mmu_invalidate_seq, causing KVM to install page table entries based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any sane userspace currently modifies VMAs in such a way as to trigger this race. And even with directed testing I was unable to reproduce it. But a sufficiently motivated host userspace might be able to exploit this race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM") Cc: stable@vger.kernel.org Reported-by: Sean Christopherson seanjc@google.com Signed-off-by: David Matlack dmatlack@google.com Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com Signed-off-by: Oliver Upton oliver.upton@linux.dev [will: Use FSC_PERM instead of ESR_ELx_FSC_PERM] Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/mmu.c | 47 +++++++++++++++++++++-------------------------- 1 file changed, 21 insertions(+), 26 deletions(-)
--- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1179,6 +1179,20 @@ static int user_mem_abort(struct kvm_vcp }
/* + * Permission faults just need to update the existing leaf entry, + * and so normally don't require allocations from the memcache. The + * only exception to this is when dirty logging is enabled at runtime + * and a write fault needs to collapse a block entry into a table. + */ + if (fault_status != FSC_PERM || + (logging_active && write_fault)) { + ret = kvm_mmu_topup_memory_cache(memcache, + kvm_mmu_cache_min_pages(kvm)); + if (ret) + return ret; + } + + /* * Let's check if we will get back a huge page backed by hugetlbfs, or * get block mapping for device MMIO region. */ @@ -1234,36 +1248,17 @@ static int user_mem_abort(struct kvm_vcp fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT; - mmap_read_unlock(current->mm); - - /* - * Permission faults just need to update the existing leaf entry, - * and so normally don't require allocations from the memcache. The - * only exception to this is when dirty logging is enabled at runtime - * and a write fault needs to collapse a block entry into a table. - */ - if (fault_status != FSC_PERM || (logging_active && write_fault)) { - ret = kvm_mmu_topup_memory_cache(memcache, - kvm_mmu_cache_min_pages(kvm)); - if (ret) - return ret; - }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq; /* - * Ensure the read of mmu_invalidate_seq happens before we call - * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk - * the page we just got a reference to gets unmapped before we have a - * chance to grab the mmu_lock, which ensure that if the page gets - * unmapped afterwards, the call to kvm_unmap_gfn will take it away - * from us again properly. This smp_rmb() interacts with the smp_wmb() - * in kvm_mmu_notifier_invalidate_<page|range_end>. + * Read mmu_invalidate_seq so that KVM can detect if the results of + * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to + * acquiring kvm->mmu_lock. * - * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is - * used to avoid unnecessary overhead introduced to locate the memory - * slot because it's always fixed even @gfn is adjusted for huge pages. + * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs + * with the smp_wmb() in kvm_mmu_invalidate_end(). */ - smp_rmb(); + mmu_seq = vcpu->kvm->mmu_invalidate_seq; + mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL, write_fault, &writable, NULL);
From: Liam R. Howlett Liam.Howlett@oracle.com
commit f4e9e0e69468583c2c6d9d5c7bfc975e292bf188 upstream.
set_mempolicy_home_node() iterates over a list of VMAs and calls mbind_range() on each VMA, which also iterates over the singular list of the VMA passed in and potentially splits the VMA. Since the VMA iterator is not passed through, set_mempolicy_home_node() may now point to a stale node in the VMA tree. This can result in a UAF as reported by syzbot.
Avoid the stale maple tree node by passing the VMA iterator through to the underlying call to split_vma().
mbind_range() is also overly complicated, since there are two calling functions and one already handles iterating over the VMAs. Simplify mbind_range() to only handle merging and splitting of the VMAs.
Align the new loop in do_mbind() and existing loop in set_mempolicy_home_node() to use the reduced mbind_range() function. This allows for a single location of the range calculation and avoids constantly looking up the previous VMA (since this is a loop over the VMAs).
Link: https://lore.kernel.org/linux-mm/000000000000c93feb05f87e24ad@google.com/ Fixes: 66850be55e8e ("mm/mempolicy: use vma iterator & maple state instead of vma linked list") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Reported-by: syzbot+a7c1ec5b1d71ceaa5186@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/20230410152205.2294819-1-Liam.Howlett@oracle.com Tested-by: syzbot+a7c1ec5b1d71ceaa5186@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/mempolicy.c | 113 ++++++++++++++++++++++++++------------------------------- 1 file changed, 52 insertions(+), 61 deletions(-)
--- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -784,70 +784,56 @@ static int vma_replace_policy(struct vm_ return err; }
-/* Step 2: apply policy to a range and do splits. */ -static int mbind_range(struct mm_struct *mm, unsigned long start, - unsigned long end, struct mempolicy *new_pol) +/* Split or merge the VMA (if required) and apply the new policy */ +static int mbind_range(struct vma_iterator *vmi, struct vm_area_struct *vma, + struct vm_area_struct **prev, unsigned long start, + unsigned long end, struct mempolicy *new_pol) { - MA_STATE(mas, &mm->mm_mt, start, start); - struct vm_area_struct *prev; - struct vm_area_struct *vma; - int err = 0; + struct vm_area_struct *merged; + unsigned long vmstart, vmend; pgoff_t pgoff; + int err;
- prev = mas_prev(&mas, 0); - if (unlikely(!prev)) - mas_set(&mas, start); + vmend = min(end, vma->vm_end); + if (start > vma->vm_start) { + *prev = vma; + vmstart = start; + } else { + vmstart = vma->vm_start; + }
- vma = mas_find(&mas, end - 1); - if (WARN_ON(!vma)) + if (mpol_equal(vma_policy(vma), new_pol)) return 0;
- if (start > vma->vm_start) - prev = vma; + pgoff = vma->vm_pgoff + ((vmstart - vma->vm_start) >> PAGE_SHIFT); + merged = vma_merge(vma->vm_mm, *prev, vmstart, vmend, vma->vm_flags, + vma->anon_vma, vma->vm_file, pgoff, new_pol, + vma->vm_userfaultfd_ctx, anon_vma_name(vma)); + if (merged) { + *prev = merged; + /* vma_merge() invalidated the mas */ + mas_pause(&vmi->mas); + return vma_replace_policy(merged, new_pol); + }
- for (; vma; vma = mas_next(&mas, end - 1)) { - unsigned long vmstart = max(start, vma->vm_start); - unsigned long vmend = min(end, vma->vm_end); - - if (mpol_equal(vma_policy(vma), new_pol)) - goto next; - - pgoff = vma->vm_pgoff + - ((vmstart - vma->vm_start) >> PAGE_SHIFT); - prev = vma_merge(mm, prev, vmstart, vmend, vma->vm_flags, - vma->anon_vma, vma->vm_file, pgoff, - new_pol, vma->vm_userfaultfd_ctx, - anon_vma_name(vma)); - if (prev) { - /* vma_merge() invalidated the mas */ - mas_pause(&mas); - vma = prev; - goto replace; - } - if (vma->vm_start != vmstart) { - err = split_vma(vma->vm_mm, vma, vmstart, 1); - if (err) - goto out; - /* split_vma() invalidated the mas */ - mas_pause(&mas); - } - if (vma->vm_end != vmend) { - err = split_vma(vma->vm_mm, vma, vmend, 0); - if (err) - goto out; - /* split_vma() invalidated the mas */ - mas_pause(&mas); - } -replace: - err = vma_replace_policy(vma, new_pol); + if (vma->vm_start != vmstart) { + err = split_vma(vma->vm_mm, vma, vmstart, 1); if (err) - goto out; -next: - prev = vma; + return err; + /* split_vma() invalidated the mas */ + mas_pause(&vmi->mas); }
-out: - return err; + if (vma->vm_end != vmend) { + err = split_vma(vma->vm_mm, vma, vmend, 0); + if (err) + return err; + /* split_vma() invalidated the mas */ + mas_pause(&vmi->mas); + } + + *prev = vma; + return vma_replace_policy(vma, new_pol); }
/* Set the process memory policy */ @@ -1259,6 +1245,8 @@ static long do_mbind(unsigned long start nodemask_t *nmask, unsigned long flags) { struct mm_struct *mm = current->mm; + struct vm_area_struct *vma, *prev; + struct vma_iterator vmi; struct mempolicy *new; unsigned long end; int err; @@ -1328,7 +1316,13 @@ static long do_mbind(unsigned long start goto up_out; }
- err = mbind_range(mm, start, end, new); + vma_iter_init(&vmi, mm, start); + prev = vma_prev(&vmi); + for_each_vma_range(vmi, vma, end) { + err = mbind_range(&vmi, vma, &prev, start, end, new); + if (err) + break; + }
if (!err) { int nr_failed = 0; @@ -1489,10 +1483,8 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, home_node, unsigned long, flags) { struct mm_struct *mm = current->mm; - struct vm_area_struct *vma; + struct vm_area_struct *vma, *prev; struct mempolicy *new; - unsigned long vmstart; - unsigned long vmend; unsigned long end; int err = -ENOENT; VMA_ITERATOR(vmi, mm, start); @@ -1521,9 +1513,8 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, if (end == start) return 0; mmap_write_lock(mm); + prev = vma_prev(&vmi); for_each_vma_range(vmi, vma, end) { - vmstart = max(start, vma->vm_start); - vmend = min(end, vma->vm_end); new = mpol_dup(vma_policy(vma)); if (IS_ERR(new)) { err = PTR_ERR(new); @@ -1547,7 +1538,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, }
new->home_node = home_node; - err = mbind_range(mm, vmstart, vmend, new); + err = mbind_range(&vmi, vma, &prev, start, end, new); mpol_put(new); if (err) break;
From: Paolo Abeni pabeni@redhat.com
commit 2a6a870e44dd88f1a6a2893c65ef756a9edfb4c7 upstream.
This is a partial revert of the blamed commit, with a relevant change: mptcp_subflow_queue_clean() now just change the msk socket status and stop the worker, so that the UaF issue addressed by the blamed commit is not re-introduced.
The above prevents the mptcp worker from running concurrently with inet_csk_listen_stop(), as such race would trigger a warning, as reported by Christoph:
RSP: 002b:00007f784fe09cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e WARNING: CPU: 0 PID: 25807 at net/ipv4/inet_connection_sock.c:1387 inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387 RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f7850afd6a9 RDX: 0000000000000000 RSI: 0000000020000340 RDI: 0000000000000004 Modules linked in: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40
</TASK> CPU: 0 PID: 25807 Comm: syz-executor.7 Not tainted 6.2.0-g778e54711659 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387 RAX: 0000000000000000 RBX: ffff888100dfbd40 RCX: 0000000000000000 RDX: ffff8881363aab80 RSI: ffffffff81c494f4 RDI: 0000000000000005 RBP: ffff888126dad080 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888100dfe040 R13: 0000000000000001 R14: 0000000000000000 R15: ffff888100dfbdd8 FS: 00007f7850a2c800(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32d26000 CR3: 000000012fdd8006 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __tcp_close+0x5b2/0x620 net/ipv4/tcp.c:2875 __mptcp_close_ssk+0x145/0x3d0 net/mptcp/protocol.c:2427 mptcp_destroy_common+0x8a/0x1c0 net/mptcp/protocol.c:3277 mptcp_destroy+0x41/0x60 net/mptcp/protocol.c:3304 __mptcp_destroy_sock+0x56/0x140 net/mptcp/protocol.c:2965 __mptcp_close+0x38f/0x4a0 net/mptcp/protocol.c:3057 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072 inet_release+0x53/0xa0 net/ipv4/af_inet.c:429 __sock_release+0x4e/0xf0 net/socket.c:651 sock_close+0x15/0x20 net/socket.c:1393 __fput+0xff/0x420 fs/file_table.c:321 task_work_run+0x8b/0xe0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x40 kernel/entry/common.c:296 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f7850af70dc RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7850af70dc RDX: 00007f7850a2c800 RSI: 0000000000000002 RDI: 0000000000000003 RBP: 00000000006bd980 R08: 0000000000000000 R09: 00000000000018a0 R10: 00000000316338a4 R11: 0000000000000293 R12: 0000000000211e31 R13: 00000000006bc05c R14: 00007f785062c000 R15: 0000000000211af0
Fixes: 0a3f4f1f9c27 ("mptcp: fix UaF in listener shutdown") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch cpaasch@apple.com Link: https://github.com/multipath-tcp/mptcp_net-next/issues/371 Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 6 ++++ net/mptcp/protocol.h | 1 net/mptcp/subflow.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 79 insertions(+)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2380,6 +2380,12 @@ static void __mptcp_close_ssk(struct soc mptcp_subflow_drop_ctx(ssk); } else { /* otherwise tcp will dispose of the ssk and subflow ctx */ + if (ssk->sk_state == TCP_LISTEN) { + tcp_set_state(ssk, TCP_CLOSE); + mptcp_subflow_queue_clean(sk, ssk); + inet_csk_listen_stop(ssk); + } + __tcp_close(ssk, 0);
/* close acquired an extra ref */ --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -615,6 +615,7 @@ void mptcp_close_ssk(struct sock *sk, st struct mptcp_subflow_context *subflow); void __mptcp_subflow_send_ack(struct sock *ssk); void mptcp_subflow_reset(struct sock *ssk); +void mptcp_subflow_queue_clean(struct sock *sk, struct sock *ssk); void mptcp_sock_graft(struct sock *sk, struct socket *parent); struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk); bool __mptcp_close(struct sock *sk, long timeout); --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1758,6 +1758,78 @@ static void subflow_state_change(struct } }
+void mptcp_subflow_queue_clean(struct sock *listener_sk, struct sock *listener_ssk) +{ + struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue; + struct mptcp_sock *msk, *next, *head = NULL; + struct request_sock *req; + + /* build a list of all unaccepted mptcp sockets */ + spin_lock_bh(&queue->rskq_lock); + for (req = queue->rskq_accept_head; req; req = req->dl_next) { + struct mptcp_subflow_context *subflow; + struct sock *ssk = req->sk; + + if (!sk_is_mptcp(ssk)) + continue; + + subflow = mptcp_subflow_ctx(ssk); + if (!subflow || !subflow->conn) + continue; + + /* skip if already in list */ + msk = mptcp_sk(subflow->conn); + if (msk->dl_next || msk == head) + continue; + + sock_hold(subflow->conn); + msk->dl_next = head; + head = msk; + } + spin_unlock_bh(&queue->rskq_lock); + if (!head) + return; + + /* can't acquire the msk socket lock under the subflow one, + * or will cause ABBA deadlock + */ + release_sock(listener_ssk); + + for (msk = head; msk; msk = next) { + struct sock *sk = (struct sock *)msk; + + lock_sock_nested(sk, SINGLE_DEPTH_NESTING); + next = msk->dl_next; + msk->dl_next = NULL; + + /* prevent the stack from later re-schedule the worker for + * this socket + */ + inet_sk_state_store(sk, TCP_CLOSE); + release_sock(sk); + + /* lockdep will report a false positive ABBA deadlock + * between cancel_work_sync and the listener socket. + * The involved locks belong to different sockets WRT + * the existing AB chain. + * Using a per socket key is problematic as key + * deregistration requires process context and must be + * performed at socket disposal time, in atomic + * context. + * Just tell lockdep to consider the listener socket + * released here. + */ + mutex_release(&listener_sk->sk_lock.dep_map, _RET_IP_); + mptcp_cancel_work(sk); + mutex_acquire(&listener_sk->sk_lock.dep_map, 0, 0, _RET_IP_); + + sock_put(sk); + } + + /* we are still under the listener msk socket lock */ + lock_sock_nested(listener_ssk, SINGLE_DEPTH_NESTING); +} + static int subflow_ulp_init(struct sock *sk) { struct inet_connection_sock *icsk = inet_csk(sk);
From: Paolo Abeni pabeni@redhat.com
commit 63740448a32eb662e05894425b47bcc5814136f4 upstream.
The mptcp worker and mptcp_accept() can race, as reported by Christoph:
refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 14351 at lib/refcount.c:25 refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25 Modules linked in: CPU: 1 PID: 14351 Comm: syz-executor.2 Not tainted 6.3.0-rc1-gde5e8fd0123c #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25 Code: 02 31 ff 89 de e8 1b f0 a7 ff 84 db 0f 85 6e ff ff ff e8 3e f5 a7 ff 48 c7 c7 d8 c7 34 83 c6 05 6d 2d 0f 02 01 e8 cb 3d 90 ff <0f> 0b e9 4f ff ff ff e8 1f f5 a7 ff 0f b6 1d 54 2d 0f 02 31 ff 89 RSP: 0018:ffffc90000a47bf8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88802eae98c0 RSI: ffffffff81097d4f RDI: 0000000000000001 RBP: ffff88802e712180 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: ffff88802eaea148 R12: ffff88802e712100 R13: ffff88802e712a88 R14: ffff888005cb93a8 R15: ffff88802e712a88 FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f277fd89120 CR3: 0000000035486002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __refcount_add include/linux/refcount.h:199 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] sock_hold include/net/sock.h:775 [inline] __mptcp_close+0x4c6/0x4d0 net/mptcp/protocol.c:3051 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072 inet_release+0x56/0xa0 net/ipv4/af_inet.c:429 __sock_release+0x51/0xf0 net/socket.c:653 sock_close+0x18/0x20 net/socket.c:1395 __fput+0x113/0x430 fs/file_table.c:321 task_work_run+0x96/0x100 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x4fc/0x10c0 kernel/exit.c:869 do_group_exit+0x51/0xf0 kernel/exit.c:1019 get_signal+0x12b0/0x1390 kernel/signal.c:2859 arch_do_signal_or_restart+0x25/0x260 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x131/0x1a0 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x19/0x40 kernel/entry/common.c:296 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7fec4b4926a9 Code: Unable to access opcode bytes at 0x7fec4b49267f. RSP: 002b:00007fec49f9dd78 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffe00 RBX: 00000000006bc058 RCX: 00007fec4b4926a9 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006bc058 RBP: 00000000006bc050 R08: 00000000007df998 R09: 00000000007df998 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 000000000000000b R15: 000000000001fe40 </TASK>
The root cause is that the worker can force fallback to TCP the first mptcp subflow, actually deleting the unaccepted msk socket.
We can explicitly prevent the race delaying the unaccepted msk deletion at listener shutdown time. In case the closed subflow is later accepted, just drop the mptcp context and let the user-space deal with the paired mptcp socket.
Fixes: b6985b9b8295 ("mptcp: use the workqueue to destroy unaccepted sockets") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch cpaasch@apple.com Link: https://github.com/multipath-tcp/mptcp_net-next/issues/375 Signed-off-by: Paolo Abeni pabeni@redhat.com Reviewed-by: Matthieu Baerts matthieu.baerts@tessares.net Tested-by: Christoph Paasch cpaasch@apple.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 68 +++++++++++++++++++++++++++++++++------------------ net/mptcp/protocol.h | 1 net/mptcp/subflow.c | 22 +++++++++------- 3 files changed, 58 insertions(+), 33 deletions(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2330,7 +2330,26 @@ static void __mptcp_close_ssk(struct soc unsigned int flags) { struct mptcp_sock *msk = mptcp_sk(sk); - bool need_push, dispose_it; + bool dispose_it, need_push = false; + + /* If the first subflow moved to a close state before accept, e.g. due + * to an incoming reset, mptcp either: + * - if either the subflow or the msk are dead, destroy the context + * (the subflow socket is deleted by inet_child_forget) and the msk + * - otherwise do nothing at the moment and take action at accept and/or + * listener shutdown - user-space must be able to accept() the closed + * socket. + */ + if (msk->in_accept_queue && msk->first == ssk) { + if (!sock_flag(sk, SOCK_DEAD) && !sock_flag(ssk, SOCK_DEAD)) + return; + + /* ensure later check in mptcp_worker() will dispose the msk */ + sock_set_flag(sk, SOCK_DEAD); + lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); + mptcp_subflow_drop_ctx(ssk); + goto out_release; + }
dispose_it = !msk->subflow || ssk != msk->subflow->sk; if (dispose_it) @@ -2366,18 +2385,6 @@ static void __mptcp_close_ssk(struct soc if (!inet_csk(ssk)->icsk_ulp_ops) { WARN_ON_ONCE(!sock_flag(ssk, SOCK_DEAD)); kfree_rcu(subflow, rcu); - } else if (msk->in_accept_queue && msk->first == ssk) { - /* if the first subflow moved to a close state, e.g. due to - * incoming reset and we reach here before inet_child_forget() - * the TCP stack could later try to close it via - * inet_csk_listen_stop(), or deliver it to the user space via - * accept(). - * We can't delete the subflow - or risk a double free - nor let - * the msk survive - or will be leaked in the non accept scenario: - * fallback and let TCP cope with the subflow cleanup. - */ - WARN_ON_ONCE(sock_flag(ssk, SOCK_DEAD)); - mptcp_subflow_drop_ctx(ssk); } else { /* otherwise tcp will dispose of the ssk and subflow ctx */ if (ssk->sk_state == TCP_LISTEN) { @@ -2391,6 +2398,8 @@ static void __mptcp_close_ssk(struct soc /* close acquired an extra ref */ __sock_put(ssk); } + +out_release: release_sock(ssk);
sock_put(ssk); @@ -2445,21 +2454,14 @@ static void __mptcp_close_subflow(struct mptcp_close_ssk(sk, ssk, subflow); }
- /* if the MPC subflow has been closed before the msk is accepted, - * msk will never be accept-ed, close it now - */ - if (!msk->first && msk->in_accept_queue) { - sock_set_flag(sk, SOCK_DEAD); - inet_sk_state_store(sk, TCP_CLOSE); - } }
-static bool mptcp_check_close_timeout(const struct sock *sk) +static bool mptcp_should_close(const struct sock *sk) { s32 delta = tcp_jiffies32 - inet_csk(sk)->icsk_mtup.probe_timestamp; struct mptcp_subflow_context *subflow;
- if (delta >= TCP_TIMEWAIT_LEN) + if (delta >= TCP_TIMEWAIT_LEN || mptcp_sk(sk)->in_accept_queue) return true;
/* if all subflows are in closed status don't bother with additional @@ -2667,7 +2669,7 @@ static void mptcp_worker(struct work_str * even if it is orphaned and in FIN_WAIT2 state */ if (sock_flag(sk, SOCK_DEAD)) { - if (mptcp_check_close_timeout(sk)) { + if (mptcp_should_close(sk)) { inet_sk_state_store(sk, TCP_CLOSE); mptcp_do_fastclose(sk); } @@ -2912,6 +2914,14 @@ static void __mptcp_destroy_sock(struct sock_put(sk); }
+void __mptcp_unaccepted_force_close(struct sock *sk) +{ + sock_set_flag(sk, SOCK_DEAD); + inet_sk_state_store(sk, TCP_CLOSE); + mptcp_do_fastclose(sk); + __mptcp_destroy_sock(sk); +} + static __poll_t mptcp_check_readable(struct mptcp_sock *msk) { /* Concurrent splices from sk_receive_queue into receive_queue will @@ -3759,6 +3769,18 @@ static int mptcp_stream_accept(struct so if (!ssk->sk_socket) mptcp_sock_graft(ssk, newsock); } + + /* Do late cleanup for the first subflow as necessary. Also + * deal with bad peers not doing a complete shutdown. + */ + if (msk->first && + unlikely(inet_sk_state_load(msk->first) == TCP_CLOSE)) { + __mptcp_close_ssk(newsk, msk->first, + mptcp_subflow_ctx(msk->first), 0); + if (unlikely(list_empty(&msk->conn_list))) + inet_sk_state_store(newsk, TCP_CLOSE); + } + release_sock(newsk); }
--- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -620,6 +620,7 @@ void mptcp_sock_graft(struct sock *sk, s struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk); bool __mptcp_close(struct sock *sk, long timeout); void mptcp_cancel_work(struct sock *sk); +void __mptcp_unaccepted_force_close(struct sock *sk);
bool mptcp_addresses_equal(const struct mptcp_addr_info *a, const struct mptcp_addr_info *b, bool use_port); --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -661,9 +661,12 @@ void mptcp_subflow_drop_ctx(struct sock if (!ctx) return;
- subflow_ulp_fallback(ssk, ctx); - if (ctx->conn) - sock_put(ctx->conn); + list_del(&mptcp_subflow_ctx(ssk)->node); + if (inet_csk(ssk)->icsk_ulp_ops) { + subflow_ulp_fallback(ssk, ctx); + if (ctx->conn) + sock_put(ctx->conn); + }
kfree_rcu(ctx, rcu); } @@ -1763,6 +1766,7 @@ void mptcp_subflow_queue_clean(struct so struct request_sock_queue *queue = &inet_csk(listener_ssk)->icsk_accept_queue; struct mptcp_sock *msk, *next, *head = NULL; struct request_sock *req; + struct sock *sk;
/* build a list of all unaccepted mptcp sockets */ spin_lock_bh(&queue->rskq_lock); @@ -1778,11 +1782,12 @@ void mptcp_subflow_queue_clean(struct so continue;
/* skip if already in list */ - msk = mptcp_sk(subflow->conn); + sk = subflow->conn; + msk = mptcp_sk(sk); if (msk->dl_next || msk == head) continue;
- sock_hold(subflow->conn); + sock_hold(sk); msk->dl_next = head; head = msk; } @@ -1796,16 +1801,13 @@ void mptcp_subflow_queue_clean(struct so release_sock(listener_ssk);
for (msk = head; msk; msk = next) { - struct sock *sk = (struct sock *)msk; + sk = (struct sock *)msk;
lock_sock_nested(sk, SINGLE_DEPTH_NESTING); next = msk->dl_next; msk->dl_next = NULL;
- /* prevent the stack from later re-schedule the worker for - * this socket - */ - inet_sk_state_store(sk, TCP_CLOSE); + __mptcp_unaccepted_force_close(sk); release_sock(sk);
/* lockdep will report a false positive ABBA deadlock
From: Jisoo Jang jisoo.jang@yonsei.ac.kr
commit 0da40e018fd034d87c9460123fa7f897b69fdee7 upstream.
Fix a slab-out-of-bounds read that occurs in kmemdup() called from brcmf_get_assoc_ies(). The bug could occur when assoc_info->req_len, data from a URB provided by a USB device, is bigger than the size of buffer which is defined as WL_EXTRA_BUF_MAX.
Add the size check for req_len/resp_len of assoc_info.
Found by a modified version of syzkaller.
[ 46.592467][ T7] ================================================================== [ 46.594687][ T7] BUG: KASAN: slab-out-of-bounds in kmemdup+0x3e/0x50 [ 46.596572][ T7] Read of size 3014656 at addr ffff888019442000 by task kworker/0:1/7 [ 46.598575][ T7] [ 46.599157][ T7] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G O 5.14.0+ #145 [ 46.601333][ T7] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 46.604360][ T7] Workqueue: events brcmf_fweh_event_worker [ 46.605943][ T7] Call Trace: [ 46.606584][ T7] dump_stack_lvl+0x8e/0xd1 [ 46.607446][ T7] print_address_description.constprop.0.cold+0x93/0x334 [ 46.608610][ T7] ? kmemdup+0x3e/0x50 [ 46.609341][ T7] kasan_report.cold+0x79/0xd5 [ 46.610151][ T7] ? kmemdup+0x3e/0x50 [ 46.610796][ T7] kasan_check_range+0x14e/0x1b0 [ 46.611691][ T7] memcpy+0x20/0x60 [ 46.612323][ T7] kmemdup+0x3e/0x50 [ 46.612987][ T7] brcmf_get_assoc_ies+0x967/0xf60 [ 46.613904][ T7] ? brcmf_notify_vif_event+0x3d0/0x3d0 [ 46.614831][ T7] ? lock_chain_count+0x20/0x20 [ 46.615683][ T7] ? mark_lock.part.0+0xfc/0x2770 [ 46.616552][ T7] ? lock_chain_count+0x20/0x20 [ 46.617409][ T7] ? mark_lock.part.0+0xfc/0x2770 [ 46.618244][ T7] ? lock_chain_count+0x20/0x20 [ 46.619024][ T7] brcmf_bss_connect_done.constprop.0+0x241/0x2e0 [ 46.620019][ T7] ? brcmf_parse_configure_security.isra.0+0x2a0/0x2a0 [ 46.620818][ T7] ? __lock_acquire+0x181f/0x5790 [ 46.621462][ T7] brcmf_notify_connect_status+0x448/0x1950 [ 46.622134][ T7] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 46.622736][ T7] ? brcmf_cfg80211_join_ibss+0x7b0/0x7b0 [ 46.623390][ T7] ? find_held_lock+0x2d/0x110 [ 46.623962][ T7] ? brcmf_fweh_event_worker+0x19f/0xc60 [ 46.624603][ T7] ? mark_held_locks+0x9f/0xe0 [ 46.625145][ T7] ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0 [ 46.625871][ T7] ? brcmf_cfg80211_join_ibss+0x7b0/0x7b0 [ 46.626545][ T7] brcmf_fweh_call_event_handler.isra.0+0x90/0x100 [ 46.627338][ T7] brcmf_fweh_event_worker+0x557/0xc60 [ 46.627962][ T7] ? brcmf_fweh_call_event_handler.isra.0+0x100/0x100 [ 46.628736][ T7] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 46.629396][ T7] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 46.629970][ T7] ? lockdep_hardirqs_on_prepare+0x273/0x3e0 [ 46.630649][ T7] process_one_work+0x92b/0x1460 [ 46.631205][ T7] ? pwq_dec_nr_in_flight+0x330/0x330 [ 46.631821][ T7] ? rwlock_bug.part.0+0x90/0x90 [ 46.632347][ T7] worker_thread+0x95/0xe00 [ 46.632832][ T7] ? __kthread_parkme+0x115/0x1e0 [ 46.633393][ T7] ? process_one_work+0x1460/0x1460 [ 46.633957][ T7] kthread+0x3a1/0x480 [ 46.634369][ T7] ? set_kthread_struct+0x120/0x120 [ 46.634933][ T7] ret_from_fork+0x1f/0x30 [ 46.635431][ T7] [ 46.635687][ T7] Allocated by task 7: [ 46.636151][ T7] kasan_save_stack+0x1b/0x40 [ 46.636628][ T7] __kasan_kmalloc+0x7c/0x90 [ 46.637108][ T7] kmem_cache_alloc_trace+0x19e/0x330 [ 46.637696][ T7] brcmf_cfg80211_attach+0x4a0/0x4040 [ 46.638275][ T7] brcmf_attach+0x389/0xd40 [ 46.638739][ T7] brcmf_usb_probe+0x12de/0x1690 [ 46.639279][ T7] usb_probe_interface+0x2aa/0x760 [ 46.639820][ T7] really_probe+0x205/0xb70 [ 46.640342][ T7] __driver_probe_device+0x311/0x4b0 [ 46.640876][ T7] driver_probe_device+0x4e/0x150 [ 46.641445][ T7] __device_attach_driver+0x1cc/0x2a0 [ 46.642000][ T7] bus_for_each_drv+0x156/0x1d0 [ 46.642543][ T7] __device_attach+0x23f/0x3a0 [ 46.643065][ T7] bus_probe_device+0x1da/0x290 [ 46.643644][ T7] device_add+0xb7b/0x1eb0 [ 46.644130][ T7] usb_set_configuration+0xf59/0x16f0 [ 46.644720][ T7] usb_generic_driver_probe+0x82/0xa0 [ 46.645295][ T7] usb_probe_device+0xbb/0x250 [ 46.645786][ T7] really_probe+0x205/0xb70 [ 46.646258][ T7] __driver_probe_device+0x311/0x4b0 [ 46.646804][ T7] driver_probe_device+0x4e/0x150 [ 46.647387][ T7] __device_attach_driver+0x1cc/0x2a0 [ 46.647926][ T7] bus_for_each_drv+0x156/0x1d0 [ 46.648454][ T7] __device_attach+0x23f/0x3a0 [ 46.648939][ T7] bus_probe_device+0x1da/0x290 [ 46.649478][ T7] device_add+0xb7b/0x1eb0 [ 46.649936][ T7] usb_new_device.cold+0x49c/0x1029 [ 46.650526][ T7] hub_event+0x1c98/0x3950 [ 46.650975][ T7] process_one_work+0x92b/0x1460 [ 46.651535][ T7] worker_thread+0x95/0xe00 [ 46.651991][ T7] kthread+0x3a1/0x480 [ 46.652413][ T7] ret_from_fork+0x1f/0x30 [ 46.652885][ T7] [ 46.653131][ T7] The buggy address belongs to the object at ffff888019442000 [ 46.653131][ T7] which belongs to the cache kmalloc-2k of size 2048 [ 46.654669][ T7] The buggy address is located 0 bytes inside of [ 46.654669][ T7] 2048-byte region [ffff888019442000, ffff888019442800) [ 46.656137][ T7] The buggy address belongs to the page: [ 46.656720][ T7] page:ffffea0000651000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x19440 [ 46.657792][ T7] head:ffffea0000651000 order:3 compound_mapcount:0 compound_pincount:0 [ 46.658673][ T7] flags: 0x100000000010200(slab|head|node=0|zone=1) [ 46.659422][ T7] raw: 0100000000010200 0000000000000000 dead000000000122 ffff888100042000 [ 46.660363][ T7] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000 [ 46.661236][ T7] page dumped because: kasan: bad access detected [ 46.661956][ T7] page_owner tracks the page as allocated [ 46.662588][ T7] page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 7, ts 31136961085, free_ts 0 [ 46.664271][ T7] prep_new_page+0x1aa/0x240 [ 46.664763][ T7] get_page_from_freelist+0x159a/0x27c0 [ 46.665340][ T7] __alloc_pages+0x2da/0x6a0 [ 46.665847][ T7] alloc_pages+0xec/0x1e0 [ 46.666308][ T7] allocate_slab+0x380/0x4e0 [ 46.666770][ T7] ___slab_alloc+0x5bc/0x940 [ 46.667264][ T7] __slab_alloc+0x6d/0x80 [ 46.667712][ T7] kmem_cache_alloc_trace+0x30a/0x330 [ 46.668299][ T7] brcmf_usbdev_qinit.constprop.0+0x50/0x470 [ 46.668885][ T7] brcmf_usb_probe+0xc97/0x1690 [ 46.669438][ T7] usb_probe_interface+0x2aa/0x760 [ 46.669988][ T7] really_probe+0x205/0xb70 [ 46.670487][ T7] __driver_probe_device+0x311/0x4b0 [ 46.671031][ T7] driver_probe_device+0x4e/0x150 [ 46.671604][ T7] __device_attach_driver+0x1cc/0x2a0 [ 46.672192][ T7] bus_for_each_drv+0x156/0x1d0 [ 46.672739][ T7] page_owner free stack trace missing [ 46.673335][ T7] [ 46.673620][ T7] Memory state around the buggy address: [ 46.674213][ T7] ffff888019442700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 46.675083][ T7] ffff888019442780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 46.675994][ T7] >ffff888019442800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 46.676875][ T7] ^ [ 46.677323][ T7] ffff888019442880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 46.678190][ T7] ffff888019442900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 46.679052][ T7] ================================================================== [ 46.679945][ T7] Disabling lock debugging due to kernel taint [ 46.680725][ T7] Kernel panic - not syncing:
Reviewed-by: Arend van Spriel arend.vanspriel@broadcom.com Signed-off-by: Jisoo Jang jisoo.jang@yonsei.ac.kr Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230309104457.22628-1-jisoo.jang@yonsei.ac.kr Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c @@ -5888,6 +5888,11 @@ static s32 brcmf_get_assoc_ies(struct br (struct brcmf_cfg80211_assoc_ielen_le *)cfg->extra_buf; req_len = le32_to_cpu(assoc_info->req_len); resp_len = le32_to_cpu(assoc_info->resp_len); + if (req_len > WL_EXTRA_BUF_MAX || resp_len > WL_EXTRA_BUF_MAX) { + bphy_err(drvr, "invalid lengths in assoc info: req %u resp %u\n", + req_len, resp_len); + return -EINVAL; + } if (req_len) { err = brcmf_fil_iovar_data_get(ifp, "assoc_req_ies", cfg->extra_buf,
From: Daniel Vetter daniel.vetter@ffwll.ch
commit 1935f0deb6116dd785ea64d8035eab0ff441255b upstream.
Drivers are supposed to fix this up if needed if they don't outright reject it. Uncovered by 6c11df58fd1a ("fbmem: Check virtual screen sizes in fb_set_var()").
Reported-by: syzbot+20dcf81733d43ddff661@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=c5faf983bfa4a607de530cd3bb008888bf06cef... Cc: stable@vger.kernel.org # v5.4+ Cc: Daniel Vetter daniel@ffwll.ch Cc: Javier Martinez Canillas javierm@redhat.com Cc: Thomas Zimmermann tzimmermann@suse.de Reviewed-by: Javier Martinez Canillas javierm@redhat.com Signed-off-by: Daniel Vetter daniel.vetter@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230404194038.472803-1-daniel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_fb_helper.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -1406,6 +1406,9 @@ int drm_fb_helper_check_var(struct fb_va return -EINVAL; }
+ var->xres_virtual = fb->width; + var->yres_virtual = fb->height; + /* * Workaround for SDL 1.2, which is known to be setting all pixel format * fields values to zero in some cases. We treat this situation as a
From: Werner Sembach wse@tuxedocomputers.com
commit 782eea0c89f7d071d6b56ecfa1b8b0c81164b9be upstream.
commit 1796f808e4bb ("HID: i2c-hid: acpi: Stop setting wakeup_capable") changed the policy such that I2C touchpads may be able to wake up the system by default if the system is configured as such.
However on Clevo NL5xNU there is a mistake in the ACPI tables that the TP_ATTN# signal connected to GPIO 9 is configured as ActiveLow and level triggered but connected to a pull up. As soon as the system suspends the touchpad loses power and then the system wakes up.
To avoid this problem, introduce a quirk for this model that will prevent the wakeup capability for being set for GPIO 9.
This patch is analoge to a very similar patch for NL5xRU, just the DMI string changed.
Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: stable@vger.kernel.org Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpiolib-acpi.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
--- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -1607,6 +1607,19 @@ static const struct dmi_system_id gpioli * https://gitlab.freedesktop.org/drm/amd/-/issues/1722#note_1720627 */ .matches = { + DMI_MATCH(DMI_BOARD_NAME, "NL5xNU"), + }, + .driver_data = &(struct acpi_gpiolib_dmi_quirk) { + .ignore_wake = "ELAN0415:00@9", + }, + }, + { + /* + * Spurious wakeups from TP_ATTN# pin + * Found in BIOS 1.7.8 + * https://gitlab.freedesktop.org/drm/amd/-/issues/1722#note_1720627 + */ + .matches = { DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"), }, .driver_data = &(struct acpi_gpiolib_dmi_quirk) {
From: Ruihan Li lrh2000@pku.edu.cn
commit 25c150ac103a4ebeed0319994c742a90634ddf18 upstream.
Previously, capability was checked using capable(), which verified that the caller of the ioctl system call had the required capability. In addition, the result of the check would be stored in the HCI_SOCK_TRUSTED flag, making it persistent for the socket.
However, malicious programs can abuse this approach by deliberately sharing an HCI socket with a privileged task. The HCI socket will be marked as trusted when the privileged task occasionally makes an ioctl call.
This problem can be solved by using sk_capable() to check capability, which ensures that not only the current task but also the socket opener has the specified capability, thus reducing the risk of privilege escalation through the previously identified vulnerability.
Cc: stable@vger.kernel.org Fixes: f81f5b2db869 ("Bluetooth: Send control open and close messages for HCI raw sockets") Signed-off-by: Ruihan Li lrh2000@pku.edu.cn Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bluetooth/hci_sock.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/net/bluetooth/hci_sock.c +++ b/net/bluetooth/hci_sock.c @@ -1003,7 +1003,14 @@ static int hci_sock_ioctl(struct socket if (hci_sock_gen_cookie(sk)) { struct sk_buff *skb;
- if (capable(CAP_NET_ADMIN)) + /* Perform careful checks before setting the HCI_SOCK_TRUSTED + * flag. Make sure that not only the current task but also + * the socket opener has the required capability, since + * privileged programs can be tricked into making ioctl calls + * on HCI sockets, and the socket should not be marked as + * trusted simply because the ioctl caller is privileged. + */ + if (sk_capable(sk, CAP_NET_ADMIN)) hci_sock_set_flag(sk, HCI_SOCK_TRUSTED);
/* Send event to monitor */
From: Genjian Zhang zhanggenjian@kylinos.cn
commit 8ba7d5f5ba931be68a94b8c91bcced1622934e7a upstream.
There are some warnings on older compilers (gcc 10, 7) or non-x86_64 architectures (aarch64). As btrfs wants to enable -Wmaybe-uninitialized by default, fix the warnings even though it's not necessary on recent compilers (gcc 12+).
../fs/btrfs/volumes.c: In function ‘btrfs_init_new_device’: ../fs/btrfs/volumes.c:2703:3: error: ‘seed_devices’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 2703 | btrfs_setup_sprout(fs_info, seed_devices); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../fs/btrfs/send.c: In function ‘get_cur_inode_state’: ../include/linux/compiler.h:70:32: error: ‘right_gen’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 70 | (__if_trace.miss_hit[1]++,1) : \ | ^ ../fs/btrfs/send.c:1878:6: note: ‘right_gen’ was declared here 1878 | u64 right_gen; | ^~~~~~~~~
Reported-by: k2ci kernel-bot@kylinos.cn Signed-off-by: Genjian Zhang zhanggenjian@kylinos.cn Reviewed-by: David Sterba dsterba@suse.com [ update changelog ] Signed-off-by: David Sterba dsterba@suse.com Cc: Ammar Faizi ammarfaizi2@gnuweeb.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/send.c | 2 +- fs/btrfs/volumes.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/fs/btrfs/send.c +++ b/fs/btrfs/send.c @@ -1658,7 +1658,7 @@ static int get_cur_inode_state(struct se int left_ret; int right_ret; u64 left_gen; - u64 right_gen; + u64 right_gen = 0; struct btrfs_inode_info info;
ret = get_inode_info(sctx->send_root, ino, &info); --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2631,7 +2631,7 @@ int btrfs_init_new_device(struct btrfs_f struct super_block *sb = fs_info->sb; struct rcu_string *name; struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; - struct btrfs_fs_devices *seed_devices; + struct btrfs_fs_devices *seed_devices = NULL; u64 orig_super_total_bytes; u64 orig_super_num_devices; int ret = 0;
From: Arınç ÜNAL arinc.unal@arinc9.com
commit a095edfc15f0832e046ae23964e249ef5c95af87 upstream.
Add UNISOC vendor ID and TOZED LT70-C modem which is based from UNISOC SL8563. The modem supports the NCM mode. Interface 0 is used for running the AT commands. Interface 12 is the ADB interface.
T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1782 ProdID=4055 Rev=04.04 S: Manufacturer=Unisoc Phone S: Product=Unisoc Phone S: SerialNumber=<redacted> C: #Ifs=14 Cfg#= 1 Atr=c0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=10 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=11 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=08(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=12 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=09(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#=13 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=0a(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=84(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 3 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=86(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 5 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 6 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=88(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 7 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 9 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Arınç ÜNAL arinc.unal@arinc9.com Link: https://lore.kernel.org/r/20230417152003.243248-1-arinc.unal@arinc9.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -595,6 +595,11 @@ static void option_instat_callback(struc #define SIERRA_VENDOR_ID 0x1199 #define SIERRA_PRODUCT_EM9191 0x90d3
+/* UNISOC (Spreadtrum) products */ +#define UNISOC_VENDOR_ID 0x1782 +/* TOZED LT70-C based on UNISOC SL8563 uses UNISOC's vendor ID */ +#define TOZED_PRODUCT_LT70C 0x4055 + /* Device flags */
/* Highest interface number which can be used with NCTRL() and RSVD() */ @@ -2225,6 +2230,7 @@ static const struct usb_device_id option { USB_DEVICE_AND_INTERFACE_INFO(OPPO_VENDOR_ID, OPPO_PRODUCT_R11, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0xff, 0x30) }, { USB_DEVICE_AND_INTERFACE_INFO(SIERRA_VENDOR_ID, SIERRA_PRODUCT_EM9191, 0xff, 0, 0) }, + { USB_DEVICE_AND_INTERFACE_INFO(UNISOC_VENDOR_ID, TOZED_PRODUCT_LT70C, 0xff, 0, 0) }, { } /* Terminating entry */ }; MODULE_DEVICE_TABLE(usb, option_ids);
From: Stephen Boyd swboyd@chromium.org
commit e2f06aa885081e1391916367f53bad984714b4db upstream.
Don't require the use of dynamic debug (or modification of the kernel to add a #define DEBUG to the top of this file) to get the printk message about driver probe timing. This printk is only emitted when initcall_debug is enabled on the kernel commandline, and it isn't immediately obvious that you have to do something else to debug boot timing issues related to driver probe. Add a comment too so it doesn't get converted back to pr_debug().
Fixes: eb7fbc9fb118 ("driver core: Add missing '\n' in log messages") Cc: stable stable@kernel.org Cc: Christophe JAILLET christophe.jaillet@wanadoo.fr Cc: Brian Norris briannorris@chromium.org Reviewed-by: Brian Norris briannorris@chromium.org Acked-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/dd.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -718,7 +718,12 @@ static int really_probe_debug(struct dev calltime = ktime_get(); ret = really_probe(dev, drv); rettime = ktime_get(); - pr_debug("probe of %s returned %d after %lld usecs\n", + /* + * Don't change this to pr_debug() because that requires + * CONFIG_DYNAMIC_DEBUG and we want a simple 'initcall_debug' on the + * kernel commandline to print this all the time at the debug level. + */ + printk(KERN_DEBUG "probe of %s returned %d after %lld usecs\n", dev_name(dev), ret, ktime_us_delta(rettime, calltime)); return ret; }
From: Alexandre Ghiti alexghiti@rivosinc.com
commit ef69d2559fe91f23d27a3d6fd640b5641787d22e upstream.
riscv establishes 2 virtual mappings:
- early_pg_dir maps the kernel which allows to discover the system memory - swapper_pg_dir installs the final mapping (linear mapping included)
We used to map the dtb in early_pg_dir using DTB_EARLY_BASE_VA, and this mapping was not carried over in swapper_pg_dir. It happens that early_init_fdt_scan_reserved_mem() must be called before swapper_pg_dir is setup otherwise we could allocate reserved memory defined in the dtb. And this function initializes reserved_mem variable with addresses that lie in the early_pg_dir dtb mapping: when those addresses are reused with swapper_pg_dir, this mapping does not exist and then we trap.
The previous "fix" was incorrect as early_init_fdt_scan_reserved_mem() must be called before swapper_pg_dir is set up otherwise we could allocate in reserved memory defined in the dtb.
So move the dtb mapping in the fixmap region which is established in early_pg_dir and handed over to swapper_pg_dir.
This patch had to be backported because: - the documentation for sv57 is not present here (as sv48/57 are not present)
Fixes: 922b0375fc93 ("riscv: Fix memblock reservation for device tree blob") Fixes: 8f3a2b4a96dc ("RISC-V: Move DT mapping outof fixmap") Fixes: 50e63dd8ed92 ("riscv: fix reserved memory setup") Reported-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/all/f8e67f82-103d-156c-deb0-d6d6e2756f5e@microchip.c... Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Reviewed-by: Conor Dooley conor.dooley@microchip.com Tested-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20230329081932.79831-2-alexghiti@rivosinc.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/riscv/vm-layout.rst | 4 +- arch/riscv/include/asm/fixmap.h | 8 ++++ arch/riscv/include/asm/pgtable.h | 8 +++- arch/riscv/kernel/setup.c | 1 arch/riscv/mm/init.c | 61 +++++++++++++++++++++----------------- 5 files changed, 50 insertions(+), 32 deletions(-)
--- a/Documentation/riscv/vm-layout.rst +++ b/Documentation/riscv/vm-layout.rst @@ -47,7 +47,7 @@ RISC-V Linux Kernel SV39 | Kernel-space virtual memory, shared between all processes: ____________________________________________________________|___________________________________________________________ | | | | - ffffffc6fee00000 | -228 GB | ffffffc6feffffff | 2 MB | fixmap + ffffffc6fea00000 | -228 GB | ffffffc6feffffff | 6 MB | fixmap ffffffc6ff000000 | -228 GB | ffffffc6ffffffff | 16 MB | PCI io ffffffc700000000 | -228 GB | ffffffc7ffffffff | 4 GB | vmemmap ffffffc800000000 | -224 GB | ffffffd7ffffffff | 64 GB | vmalloc/ioremap space @@ -83,7 +83,7 @@ RISC-V Linux Kernel SV48 | Kernel-space virtual memory, shared between all processes: ____________________________________________________________|___________________________________________________________ | | | | - ffff8d7ffee00000 | -114.5 TB | ffff8d7ffeffffff | 2 MB | fixmap + ffff8d7ffea00000 | -114.5 TB | ffff8d7ffeffffff | 6 MB | fixmap ffff8d7fff000000 | -114.5 TB | ffff8d7fffffffff | 16 MB | PCI io ffff8d8000000000 | -114.5 TB | ffff8f7fffffffff | 2 TB | vmemmap ffff8f8000000000 | -112.5 TB | ffffaf7fffffffff | 32 TB | vmalloc/ioremap space --- a/arch/riscv/include/asm/fixmap.h +++ b/arch/riscv/include/asm/fixmap.h @@ -22,6 +22,14 @@ */ enum fixed_addresses { FIX_HOLE, + /* + * The fdt fixmap mapping must be PMD aligned and will be mapped + * using PMD entries in fixmap_pmd in 64-bit and a PGD entry in 32-bit. + */ + FIX_FDT_END, + FIX_FDT = FIX_FDT_END + FIX_FDT_SIZE / PAGE_SIZE - 1, + + /* Below fixmaps will be mapped using fixmap_pte */ FIX_PTE, FIX_PMD, FIX_PUD, --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -87,9 +87,13 @@
#define FIXADDR_TOP PCI_IO_START #ifdef CONFIG_64BIT -#define FIXADDR_SIZE PMD_SIZE +#define MAX_FDT_SIZE PMD_SIZE +#define FIX_FDT_SIZE (MAX_FDT_SIZE + SZ_2M) +#define FIXADDR_SIZE (PMD_SIZE + FIX_FDT_SIZE) #else -#define FIXADDR_SIZE PGDIR_SIZE +#define MAX_FDT_SIZE PGDIR_SIZE +#define FIX_FDT_SIZE MAX_FDT_SIZE +#define FIXADDR_SIZE (PGDIR_SIZE + FIX_FDT_SIZE) #endif #define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
--- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -283,7 +283,6 @@ void __init setup_arch(char **cmdline_p) else pr_err("No DTB found in kernel mappings\n"); #endif - early_init_fdt_scan_reserved_mem(); misc_mem_init();
init_resources(); --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -57,7 +57,6 @@ unsigned long empty_zero_page[PAGE_SIZE EXPORT_SYMBOL(empty_zero_page);
extern char _start[]; -#define DTB_EARLY_BASE_VA PGDIR_SIZE void *_dtb_early_va __initdata; uintptr_t _dtb_early_pa __initdata;
@@ -236,6 +235,14 @@ static void __init setup_bootmem(void) set_max_mapnr(max_low_pfn - ARCH_PFN_OFFSET);
reserve_initrd_mem(); + + /* + * No allocation should be done before reserving the memory as defined + * in the device tree, otherwise the allocation could end up in a + * reserved region. + */ + early_init_fdt_scan_reserved_mem(); + /* * If DTB is built in, no need to reserve its memblock. * Otherwise, do reserve it but avoid using @@ -279,9 +286,6 @@ pgd_t trampoline_pg_dir[PTRS_PER_PGD] __ static pte_t fixmap_pte[PTRS_PER_PTE] __page_aligned_bss;
pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); -static p4d_t __maybe_unused early_dtb_p4d[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE); -static pud_t __maybe_unused early_dtb_pud[PTRS_PER_PUD] __initdata __aligned(PAGE_SIZE); -static pmd_t __maybe_unused early_dtb_pmd[PTRS_PER_PMD] __initdata __aligned(PAGE_SIZE);
#ifdef CONFIG_XIP_KERNEL #define pt_ops (*(struct pt_alloc_ops *)XIP_FIXUP(&pt_ops)) @@ -626,9 +630,6 @@ static void __init create_p4d_mapping(p4 #define trampoline_pgd_next (pgtable_l5_enabled ? \ (uintptr_t)trampoline_p4d : (pgtable_l4_enabled ? \ (uintptr_t)trampoline_pud : (uintptr_t)trampoline_pmd)) -#define early_dtb_pgd_next (pgtable_l5_enabled ? \ - (uintptr_t)early_dtb_p4d : (pgtable_l4_enabled ? \ - (uintptr_t)early_dtb_pud : (uintptr_t)early_dtb_pmd)) #else #define pgd_next_t pte_t #define alloc_pgd_next(__va) pt_ops.alloc_pte(__va) @@ -636,7 +637,6 @@ static void __init create_p4d_mapping(p4 #define create_pgd_next_mapping(__nextp, __va, __pa, __sz, __prot) \ create_pte_mapping(__nextp, __va, __pa, __sz, __prot) #define fixmap_pgd_next ((uintptr_t)fixmap_pte) -#define early_dtb_pgd_next ((uintptr_t)early_dtb_pmd) #define create_p4d_mapping(__pmdp, __va, __pa, __sz, __prot) do {} while(0) #define create_pud_mapping(__pmdp, __va, __pa, __sz, __prot) do {} while(0) #define create_pmd_mapping(__pmdp, __va, __pa, __sz, __prot) do {} while(0) @@ -859,32 +859,28 @@ static void __init create_kernel_page_ta * this means 2 PMD entries whereas for 32-bit kernel, this is only 1 PGDIR * entry. */ -static void __init create_fdt_early_page_table(pgd_t *pgdir, uintptr_t dtb_pa) +static void __init create_fdt_early_page_table(pgd_t *pgdir, + uintptr_t fix_fdt_va, + uintptr_t dtb_pa) { -#ifndef CONFIG_BUILTIN_DTB uintptr_t pa = dtb_pa & ~(PMD_SIZE - 1);
- create_pgd_mapping(early_pg_dir, DTB_EARLY_BASE_VA, - IS_ENABLED(CONFIG_64BIT) ? early_dtb_pgd_next : pa, - PGDIR_SIZE, - IS_ENABLED(CONFIG_64BIT) ? PAGE_TABLE : PAGE_KERNEL); - - if (pgtable_l5_enabled) - create_p4d_mapping(early_dtb_p4d, DTB_EARLY_BASE_VA, - (uintptr_t)early_dtb_pud, P4D_SIZE, PAGE_TABLE); - - if (pgtable_l4_enabled) - create_pud_mapping(early_dtb_pud, DTB_EARLY_BASE_VA, - (uintptr_t)early_dtb_pmd, PUD_SIZE, PAGE_TABLE); +#ifndef CONFIG_BUILTIN_DTB + /* Make sure the fdt fixmap address is always aligned on PMD size */ + BUILD_BUG_ON(FIX_FDT % (PMD_SIZE / PAGE_SIZE));
- if (IS_ENABLED(CONFIG_64BIT)) { - create_pmd_mapping(early_dtb_pmd, DTB_EARLY_BASE_VA, + /* In 32-bit only, the fdt lies in its own PGD */ + if (!IS_ENABLED(CONFIG_64BIT)) { + create_pgd_mapping(early_pg_dir, fix_fdt_va, + pa, MAX_FDT_SIZE, PAGE_KERNEL); + } else { + create_pmd_mapping(fixmap_pmd, fix_fdt_va, pa, PMD_SIZE, PAGE_KERNEL); - create_pmd_mapping(early_dtb_pmd, DTB_EARLY_BASE_VA + PMD_SIZE, + create_pmd_mapping(fixmap_pmd, fix_fdt_va + PMD_SIZE, pa + PMD_SIZE, PMD_SIZE, PAGE_KERNEL); }
- dtb_early_va = (void *)DTB_EARLY_BASE_VA + (dtb_pa & (PMD_SIZE - 1)); + dtb_early_va = (void *)fix_fdt_va + (dtb_pa & (PMD_SIZE - 1)); #else /* * For 64-bit kernel, __va can't be used since it would return a linear @@ -1054,7 +1050,8 @@ asmlinkage void __init setup_vm(uintptr_ create_kernel_page_table(early_pg_dir, true);
/* Setup early mapping for FDT early scan */ - create_fdt_early_page_table(early_pg_dir, dtb_pa); + create_fdt_early_page_table(early_pg_dir, + __fix_to_virt(FIX_FDT), dtb_pa);
/* * Bootime fixmap only can handle PMD_SIZE mapping. Thus, boot-ioremap @@ -1096,6 +1093,16 @@ static void __init setup_vm_final(void) u64 i;
/* Setup swapper PGD for fixmap */ +#if !defined(CONFIG_64BIT) + /* + * In 32-bit, the device tree lies in a pgd entry, so it must be copied + * directly in swapper_pg_dir in addition to the pgd entry that points + * to fixmap_pte. + */ + unsigned long idx = pgd_index(__fix_to_virt(FIX_FDT)); + + set_pgd(&swapper_pg_dir[idx], early_pg_dir[idx]); +#endif create_pgd_mapping(swapper_pg_dir, FIXADDR_START, __pa_symbol(fixmap_pgd_next), PGDIR_SIZE, PAGE_TABLE);
From: Alexandre Ghiti alexghiti@rivosinc.com
commit f1581626071c8e37c58c5e8f0b4126b17172a211 upstream.
early_init_dt_verify() is already called in parse_dtb() and since the dtb address does not change anymore (it is now in the fixmap region), no need to reset initial_boot_params by calling early_init_dt_verify() again.
Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Link: https://lore.kernel.org/r/20230329081932.79831-3-alexghiti@rivosinc.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/setup.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/arch/riscv/kernel/setup.c +++ b/arch/riscv/kernel/setup.c @@ -278,10 +278,7 @@ void __init setup_arch(char **cmdline_p) #if IS_ENABLED(CONFIG_BUILTIN_DTB) unflatten_and_copy_device_tree(); #else - if (early_init_dt_verify(__va(XIP_FIXUP(dtb_early_pa)))) - unflatten_device_tree(); - else - pr_err("No DTB found in kernel mappings\n"); + unflatten_device_tree(); #endif misc_mem_init();
From: Alexandre Ghiti alexghiti@rivosinc.com
commit 1b50f956c8fe9082bdee4a9cfd798149c52f7043 upstream.
We used to access the dtb via its linear mapping address but now that the dtb early mapping was moved in the fixmap region, we can keep using this address since it is present in swapper_pg_dir, and remove the dtb relocation.
Note that the relocation was wrong anyway since early_memremap() is restricted to 256K whereas the maximum fdt size is 2MB.
Signed-off-by: Alexandre Ghiti alexghiti@rivosinc.com Reviewed-by: Conor Dooley conor.dooley@microchip.com Tested-by: Conor Dooley conor.dooley@microchip.com Link: https://lore.kernel.org/r/20230329081932.79831-4-alexghiti@rivosinc.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/mm/init.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-)
--- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -249,25 +249,8 @@ static void __init setup_bootmem(void) * early_init_fdt_reserve_self() since __pa() does * not work for DTB pointers that are fixmap addresses */ - if (!IS_ENABLED(CONFIG_BUILTIN_DTB)) { - /* - * In case the DTB is not located in a memory region we won't - * be able to locate it later on via the linear mapping and - * get a segfault when accessing it via __va(dtb_early_pa). - * To avoid this situation copy DTB to a memory region. - * Note that memblock_phys_alloc will also reserve DTB region. - */ - if (!memblock_is_memory(dtb_early_pa)) { - size_t fdt_size = fdt_totalsize(dtb_early_va); - phys_addr_t new_dtb_early_pa = memblock_phys_alloc(fdt_size, PAGE_SIZE); - void *new_dtb_early_va = early_memremap(new_dtb_early_pa, fdt_size); - - memcpy(new_dtb_early_va, dtb_early_va, fdt_size); - early_memunmap(new_dtb_early_va, fdt_size); - _dtb_early_pa = new_dtb_early_pa; - } else - memblock_reserve(dtb_early_pa, fdt_totalsize(dtb_early_va)); - } + if (!IS_ENABLED(CONFIG_BUILTIN_DTB)) + memblock_reserve(dtb_early_pa, fdt_totalsize(dtb_early_va));
dma_contiguous_reserve(dma32_phys_limit); if (IS_ENABLED(CONFIG_64BIT))
* Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
Hi Greg
6.1.27-rc1
compiles, boots and runs here on x86_64 (AMD Ryzen 5 PRO 4650G, Slackware64-15.0)
Tested-by: Markus Reichelt lkt+2023@mareichelt.com
On Fri, 28 Apr 2023 at 12:29, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.27-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.1.27-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.1.y * git commit: 58b654bf36db7c89178300adbb034ce63301b685 * git describe: v6.1.22-591-g58b654bf36db * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/build/v6.1.22...
## Test Regressions (compared to v6.1.22-574-ge4ff6ff54dea)
## Metric Regressions (compared to v6.1.22-574-ge4ff6ff54dea)
## Test Fixes (compared to v6.1.22-574-ge4ff6ff54dea)
## Metric Fixes (compared to v6.1.22-574-ge4ff6ff54dea)
## Test result summary total: 158499, pass: 138000, fail: 3707, skip: 16470, xfail: 322
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 145 total, 144 passed, 1 failed * arm64: 48 total, 48 passed, 0 failed * i386: 35 total, 34 passed, 1 failed * mips: 26 total, 26 passed, 0 failed * parisc: 6 total, 6 passed, 0 failed * powerpc: 34 total, 34 passed, 0 failed * riscv: 12 total, 12 passed, 0 failed * s390: 12 total, 12 passed, 0 failed * sh: 12 total, 12 passed, 0 failed * sparc: 6 total, 6 passed, 0 failed * x86_64: 40 total, 40 passed, 0 failed
## Test suites summary * boot * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-vm * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * perf * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
On 4/28/23 05:27, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.27-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Fri, Apr 28, 2023 at 01:27:52PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 519 pass: 519 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On 4/28/23 4:27 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.27-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Fri, Apr 28, 2023 at 01:27:52PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully built and installed bindeb-pkgs for my computer (Acer E15, Intel Core i3 Haswell).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On Fri, Apr 28, 2023 at 01:27:52PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
LGTM in terms of what my CI is testing. I didn't test the niche configurations that the fixmap stuff was introduced for specifically here, but you won't be too long hearing from me next week if it goes awry there.
Tested-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
Hi Greg
On Fri, Apr 28, 2023 at 8:30 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.27-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
6.1.27-rc1 tested.
x86_64
Build successfully completed. Boot successfully completed. No dmesg regressions. Video output normal. Sound output normal.
Lenovo ThinkPad X1 Carbon Gen10(Intel i7-1260P, arch linux)
Thanks
Tested-by: Takeshi Ogasawara takeshi.ogasawara@futuring-girl.com
On 4/28/2023 4:27 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.27-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
Hello Greg,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: Friday, April 28, 2023 12:28 PM
This is the start of the stable review cycle for the 6.1.27 release. There are 16 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 30 Apr 2023 11:20:30 +0000. Anything received after that time might be too late.
Sorry it's late. Weekend & national holidays in the UK etc...
CIP configurations built and booted with Linux 6.1.27-rc1 (58b654bf36db): https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/85... https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/commits/linu...
Tested-by: Chris Paterson (CIP) chris.paterson2@renesas.com
Kind regards, Chris
linux-stable-mirror@lists.linaro.org