November 2020 - Linux-stable-mirror

FAILED: patch "[PATCH] mm/userfaultfd: do not access vma->vm_mm after calling" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From bfe8cc1db02ab243c62780f17fc57f65bde0afe1 Mon Sep 17 00:00:00 2001 From: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com> Date: Sat, 21 Nov 2020 22:17:15 -0800 Subject: [PATCH] mm/userfaultfd: do not access vma->vm_mm after calling handle_userfault() Alexander reported a syzkaller / KASAN finding on s390, see below for complete output. In do_huge_pmd_anonymous_page(), the pre-allocated pagetable will be freed in some cases. In the case of userfaultfd_missing(), this will happen after calling handle_userfault(), which might have released the mmap_lock. Therefore, the following pte_free(vma->vm_mm, pgtable) will access an unstable vma->vm_mm, which could have been freed or re-used already. For all architectures other than s390 this will go w/o any negative impact, because pte_free() simply frees the page and ignores the passed-in mm. The implementation for SPARC32 would also access mm->page_table_lock for pte_free(), but there is no THP support in SPARC32, so the buggy code path will not be used there. For s390, the mm->context.pgtable_list is being used to maintain the 2K pagetable fragments, and operating on an already freed or even re-used mm could result in various more or less subtle bugs due to list / pagetable corruption. Fix this by calling pte_free() before handle_userfault(), similar to how it is already done in __do_huge_pmd_anonymous_page() for the WRITE / non-huge_zero_page case. Commit 6b251fc96cf2c ("userfaultfd: call handle_userfault() for userfaultfd_missing() faults") actually introduced both, the do_huge_pmd_anonymous_page() and also __do_huge_pmd_anonymous_page() changes wrt to calling handle_userfault(), but only in the latter case it put the pte_free() before calling handle_userfault(). BUG: KASAN: use-after-free in do_huge_pmd_anonymous_page+0xcda/0xd90 mm/huge_memory.c:744 Read of size 8 at addr 00000000962d6988 by task syz-executor.0/9334 CPU: 1 PID: 9334 Comm: syz-executor.0 Not tainted 5.10.0-rc1-syzkaller-07083-g4c9720875573 #0 Hardware name: IBM 3906 M04 701 (KVM/Linux) Call Trace: do_huge_pmd_anonymous_page+0xcda/0xd90 mm/huge_memory.c:744 create_huge_pmd mm/memory.c:4256 [inline] __handle_mm_fault+0xe6e/0x1068 mm/memory.c:4480 handle_mm_fault+0x288/0x748 mm/memory.c:4607 do_exception+0x394/0xae0 arch/s390/mm/fault.c:479 do_dat_exception+0x34/0x80 arch/s390/mm/fault.c:567 pgm_check_handler+0x1da/0x22c arch/s390/kernel/entry.S:706 copy_from_user_mvcos arch/s390/lib/uaccess.c:111 [inline] raw_copy_from_user+0x3a/0x88 arch/s390/lib/uaccess.c:174 _copy_from_user+0x48/0xa8 lib/usercopy.c:16 copy_from_user include/linux/uaccess.h:192 [inline] __do_sys_sigaltstack kernel/signal.c:4064 [inline] __s390x_sys_sigaltstack+0xc8/0x240 kernel/signal.c:4060 system_call+0xe0/0x28c arch/s390/kernel/entry.S:415 Allocated by task 9334: slab_alloc_node mm/slub.c:2891 [inline] slab_alloc mm/slub.c:2899 [inline] kmem_cache_alloc+0x118/0x348 mm/slub.c:2904 vm_area_dup+0x9c/0x2b8 kernel/fork.c:356 __split_vma+0xba/0x560 mm/mmap.c:2742 split_vma+0xca/0x108 mm/mmap.c:2800 mlock_fixup+0x4ae/0x600 mm/mlock.c:550 apply_vma_lock_flags+0x2c6/0x398 mm/mlock.c:619 do_mlock+0x1aa/0x718 mm/mlock.c:711 __do_sys_mlock2 mm/mlock.c:738 [inline] __s390x_sys_mlock2+0x86/0xa8 mm/mlock.c:728 system_call+0xe0/0x28c arch/s390/kernel/entry.S:415 Freed by task 9333: slab_free mm/slub.c:3142 [inline] kmem_cache_free+0x7c/0x4b8 mm/slub.c:3158 __vma_adjust+0x7b2/0x2508 mm/mmap.c:960 vma_merge+0x87e/0xce0 mm/mmap.c:1209 userfaultfd_release+0x412/0x6b8 fs/userfaultfd.c:868 __fput+0x22c/0x7a8 fs/file_table.c:281 task_work_run+0x200/0x320 kernel/task_work.c:151 tracehook_notify_resume include/linux/tracehook.h:188 [inline] do_notify_resume+0x100/0x148 arch/s390/kernel/signal.c:538 system_call+0xe6/0x28c arch/s390/kernel/entry.S:416 The buggy address belongs to the object at 00000000962d6948 which belongs to the cache vm_area_struct of size 200 The buggy address is located 64 bytes inside of 200-byte region [00000000962d6948, 00000000962d6a10) The buggy address belongs to the page: page:00000000313a09fe refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x962d6 flags: 0x3ffff00000000200(slab) raw: 3ffff00000000200 000040000257e080 0000000c0000000c 000000008020ba00 raw: 0000000000000000 000f001e00000000 ffffffff00000001 0000000096959501 page dumped because: kasan: bad access detected page->mem_cgroup:0000000096959501 Memory state around the buggy address: 00000000962d6880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000000962d6900: 00 fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb >00000000962d6980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ 00000000962d6a00: fb fb fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00000000962d6a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: 6b251fc96cf2c ("userfaultfd: call handle_userfault() for userfaultfd_missing() faults") Reported-by: Alexander Egorenkov <egorenar(a)linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: Heiko Carstens <hca(a)linux.ibm.com> Cc: <stable(a)vger.kernel.org> [4.3+] Link: https://lkml.kernel.org/r/20201110190329.11920-1-gerald.schaefer@linux.ibm.… Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org> diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9474dbc150ed..ec2bb93f7431 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -710,7 +710,6 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf) transparent_hugepage_use_zero_page()) { pgtable_t pgtable; struct page *zero_page; - bool set; vm_fault_t ret; pgtable = pte_alloc_one(vma->vm_mm); if (unlikely(!pgtable)) @@ -723,25 +722,25 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf) } vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); ret = 0; - set = false; if (pmd_none(*vmf->pmd)) { ret = check_stable_address_space(vma->vm_mm); if (ret) { spin_unlock(vmf->ptl); + pte_free(vma->vm_mm, pgtable); } else if (userfaultfd_missing(vma)) { spin_unlock(vmf->ptl); + pte_free(vma->vm_mm, pgtable); ret = handle_userfault(vmf, VM_UFFD_MISSING); VM_BUG_ON(ret & VM_FAULT_FALLBACK); } else { set_huge_zero_page(pgtable, vma->vm_mm, vma, haddr, vmf->pmd, zero_page); spin_unlock(vmf->ptl); - set = true; } - } else + } else { spin_unlock(vmf->ptl); - if (!set) pte_free(vma->vm_mm, pgtable); + } return ret; } gfp = alloc_hugepage_direct_gfpmask(vma);

5 years

3
3
0 0

FAILED: patch "[PATCH] btrfs: cleanup cow block on error" failed to apply to 4.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 572c83acdcdafeb04e70aa46be1fa539310be20c Mon Sep 17 00:00:00 2001 From: Josef Bacik <josef(a)toxicpanda.com> Date: Tue, 29 Sep 2020 08:53:54 -0400 Subject: [PATCH] btrfs: cleanup cow block on error In fstest btrfs/064 a transaction abort in __btrfs_cow_block could lead to a system lockup. It gets stuck trying to write back inodes, and the write back thread was trying to lock an extent buffer: $ cat /proc/2143497/stack [<0>] __btrfs_tree_lock+0x108/0x250 [<0>] lock_extent_buffer_for_io+0x35e/0x3a0 [<0>] btree_write_cache_pages+0x15a/0x3b0 [<0>] do_writepages+0x28/0xb0 [<0>] __writeback_single_inode+0x54/0x5c0 [<0>] writeback_sb_inodes+0x1e8/0x510 [<0>] wb_writeback+0xcc/0x440 [<0>] wb_workfn+0xd7/0x650 [<0>] process_one_work+0x236/0x560 [<0>] worker_thread+0x55/0x3c0 [<0>] kthread+0x13a/0x150 [<0>] ret_from_fork+0x1f/0x30 This is because we got an error while COWing a block, specifically here if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) { ret = btrfs_reloc_cow_block(trans, root, buf, cow); if (ret) { btrfs_abort_transaction(trans, ret); return ret; } } [16402.241552] BTRFS: Transaction aborted (error -2) [16402.242362] WARNING: CPU: 1 PID: 2563188 at fs/btrfs/ctree.c:1074 __btrfs_cow_block+0x376/0x540 [16402.249469] CPU: 1 PID: 2563188 Comm: fsstress Not tainted 5.9.0-rc6+ #8 [16402.249936] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 [16402.250525] RIP: 0010:__btrfs_cow_block+0x376/0x540 [16402.252417] RSP: 0018:ffff9cca40e578b0 EFLAGS: 00010282 [16402.252787] RAX: 0000000000000025 RBX: 0000000000000002 RCX: ffff9132bbd19388 [16402.253278] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9132bbd19380 [16402.254063] RBP: ffff9132b41a49c0 R08: 0000000000000000 R09: 0000000000000000 [16402.254887] R10: 0000000000000000 R11: ffff91324758b080 R12: ffff91326ef17ce0 [16402.255694] R13: ffff91325fc0f000 R14: ffff91326ef176b0 R15: ffff9132815e2000 [16402.256321] FS: 00007f542c6d7b80(0000) GS:ffff9132bbd00000(0000) knlGS:0000000000000000 [16402.256973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [16402.257374] CR2: 00007f127b83f250 CR3: 0000000133480002 CR4: 0000000000370ee0 [16402.257867] Call Trace: [16402.258072] btrfs_cow_block+0x109/0x230 [16402.258356] btrfs_search_slot+0x530/0x9d0 [16402.258655] btrfs_lookup_file_extent+0x37/0x40 [16402.259155] __btrfs_drop_extents+0x13c/0xd60 [16402.259628] ? btrfs_block_rsv_migrate+0x4f/0xb0 [16402.259949] btrfs_replace_file_extents+0x190/0x820 [16402.260873] btrfs_clone+0x9ae/0xc00 [16402.261139] btrfs_extent_same_range+0x66/0x90 [16402.261771] btrfs_remap_file_range+0x353/0x3b1 [16402.262333] vfs_dedupe_file_range_one.part.0+0xd5/0x140 [16402.262821] vfs_dedupe_file_range+0x189/0x220 [16402.263150] do_vfs_ioctl+0x552/0x700 [16402.263662] __x64_sys_ioctl+0x62/0xb0 [16402.264023] do_syscall_64+0x33/0x40 [16402.264364] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [16402.264862] RIP: 0033:0x7f542c7d15cb [16402.266901] RSP: 002b:00007ffd35944ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [16402.267627] RAX: ffffffffffffffda RBX: 00000000009d1968 RCX: 00007f542c7d15cb [16402.268298] RDX: 00000000009d2490 RSI: 00000000c0189436 RDI: 0000000000000003 [16402.268958] RBP: 00000000009d2520 R08: 0000000000000036 R09: 00000000009d2e64 [16402.269726] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 [16402.270659] R13: 000000000001f000 R14: 00000000009d1970 R15: 00000000009d2e80 [16402.271498] irq event stamp: 0 [16402.271846] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [16402.272497] hardirqs last disabled at (0): [<ffffffff910dbf59>] copy_process+0x6b9/0x1ba0 [16402.273343] softirqs last enabled at (0): [<ffffffff910dbf59>] copy_process+0x6b9/0x1ba0 [16402.273905] softirqs last disabled at (0): [<0000000000000000>] 0x0 [16402.274338] ---[ end trace 737874a5a41a8236 ]--- [16402.274669] BTRFS: error (device dm-9) in __btrfs_cow_block:1074: errno=-2 No such entry [16402.276179] BTRFS info (device dm-9): forced readonly [16402.277046] BTRFS: error (device dm-9) in btrfs_replace_file_extents:2723: errno=-2 No such entry [16402.278744] BTRFS: error (device dm-9) in __btrfs_cow_block:1074: errno=-2 No such entry [16402.279968] BTRFS: error (device dm-9) in __btrfs_cow_block:1074: errno=-2 No such entry [16402.280582] BTRFS info (device dm-9): balance: ended with status: -30 The problem here is that as soon as we allocate the new block it is locked and marked dirty in the btree inode. This means that we could attempt to writeback this block and need to lock the extent buffer. However we're not unlocking it here and thus we deadlock. Fix this by unlocking the cow block if we have any errors inside of __btrfs_cow_block, and also free it so we do not leak it. CC: stable(a)vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana(a)suse.com> Signed-off-by: Josef Bacik <josef(a)toxicpanda.com> Reviewed-by: David Sterba <dsterba(a)suse.com> Signed-off-by: David Sterba <dsterba(a)suse.com> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index a165093739c4..113da62dc17f 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1064,6 +1064,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, ret = update_ref_for_cow(trans, root, buf, cow, &last_ref); if (ret) { + btrfs_tree_unlock(cow); + free_extent_buffer(cow); btrfs_abort_transaction(trans, ret); return ret; } @@ -1071,6 +1073,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state)) { ret = btrfs_reloc_cow_block(trans, root, buf, cow); if (ret) { + btrfs_tree_unlock(cow); + free_extent_buffer(cow); btrfs_abort_transaction(trans, ret); return ret; } @@ -1103,6 +1107,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, if (last_ref) { ret = tree_mod_log_free_eb(buf); if (ret) { + btrfs_tree_unlock(cow); + free_extent_buffer(cow); btrfs_abort_transaction(trans, ret); return ret; }

5 years

3
2
0 0

[PATCH 1/2] perf/x86/intel: Fix rtm_abort_event encoding on Ice Lake

by kan.liang＠linux.intel.com

From: Kan Liang <kan.liang(a)linux.intel.com> According to the event list from icelake_core_v1.09.json, the encoding of the RTM_RETIRED.ABORTED event on Ice Lake should be, "EventCode": "0xc9", "UMask": "0x04", "EventName": "RTM_RETIRED.ABORTED", Correct the wrong encoding. Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support") Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com> Cc: stable(a)vger.kernel.org --- arch/x86/events/intel/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index ec503399c5df..c5aca399a46a 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -5321,7 +5321,7 @@ __init int intel_pmu_init(void) mem_attr = icl_events_attrs; td_attr = icl_td_events_attrs; tsx_attr = icl_tsx_events_attrs; - x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xca, .umask=0x02); + x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04); x86_pmu.lbr_pt_coexist = true; intel_pmu_pebs_data_source_skl(pmem); x86_pmu.update_topdown_event = icl_update_topdown_event; -- 2.17.1

5 years

2
3
0 0

[PATCH] dyndbg: fix use before null check

by Jim Cromie

commit a2d375eda771 ("dyndbg: refine export, rename to dynamic_debug_exec_queries()") Above commit copies a string before checking for null pointer, fix this, and add a pr_err. Also trim comment, and add return val info. Fixes: a2d375eda771 Cc: stable(a)vger.kernel.org Signed-off-by: Jim Cromie <jim.cromie(a)gmail.com> --- lib/dynamic_debug.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c index bd7b3aaa93c3..711a9def8c83 100644 --- a/lib/dynamic_debug.c +++ b/lib/dynamic_debug.c @@ -553,17 +553,23 @@ static int ddebug_exec_queries(char *query, const char *modname) * @query: query-string described in admin-guide/dynamic-debug-howto * @modname: string containing module name, usually &module.mod_name * - * This uses the >/proc/dynamic_debug/control reader, allowing module - * authors to modify their dynamic-debug callsites. The modname is - * canonically struct module.mod_name, but can also be null or a - * module-wildcard, for example: "drm*". + * This uses the >control reader, allowing module authors to modify + * their dynamic-debug callsites. The modname is canonically struct + * module.mod_name, but can also be null or a module-wildcard, for + * example: "drm*". + * Returns <0 on error, >=0 for callsites changed */ int dynamic_debug_exec_queries(const char *query, const char *modname) { int rc; - char *qry = kstrndup(query, PAGE_SIZE, GFP_KERNEL); + char *qry; /* writable copy of query */ - if (!query) + if (!query) { + pr_err("non-null query/command string expected\n"); + return -EINVAL; + } + qry = kstrndup(query, PAGE_SIZE, GFP_KERNEL); + if (!qry) return -ENOMEM; rc = ddebug_exec_queries(qry, modname); -- 2.28.0

5 years

2
1
0 0

[PATCH 4.4 00/70] 4.4.165-stable review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 4.4.165 release. There are 70 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Wed Nov 28 10:50:16 UTC 2018. Anything received after that time might be too late. The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.165-rc… or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y and the diffstat can be found below. thanks, greg k-h ------------- Pseudo-Shortlog of commits: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Linux 4.4.165-rc1 Eric Biggers <ebiggers(a)google.com> HID: uhid: forbid UHID_CREATE under KERNEL_DS or elevated privileges Al Viro <viro(a)zeniv.linux.org.uk> new helper: uaccess_kernel() Hans de Goede <hdegoede(a)redhat.com> ACPI / platform: Add SMB0001 HID to forbidden_id_list Gustavo A. R. Silva <gustavo(a)embeddedor.com> drivers/misc/sgi-gru: fix Spectre v1 vulnerability Mattias Jacobsson <2pi(a)mok.nu> USB: misc: appledisplay: add 20" Apple Cinema Display Nathan Chancellor <natechancellor(a)gmail.com> misc: atmel-ssc: Fix section annotation on atmel_ssc_get_driver_data Emmanuel Pescosta <emmanuelpescosta099(a)gmail.com> usb: quirks: Add delay-init quirk for Corsair K70 LUX RGB Kai-Heng Feng <kai.heng.feng(a)canonical.com> USB: quirks: Add no-lpm quirk for Raydium touchscreens Maarten Jacobs <maarten256(a)outlook.com> usb: cdc-acm: add entry for Hiro (Conexant) modem Dan Carpenter <dan.carpenter(a)oracle.com> uio: Fix an Oops on load Sakari Ailus <sakari.ailus(a)linux.intel.com> media: v4l: event: Add subscription to list before calling "add" operation Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Revert "Bluetooth: h5: Fix missing dependency on BT_HCIUART_SERDEV" Hans Verkuil <hverkuil(a)xs4all.nl> Revert "media: videobuf2-core: don't call memop 'finish' when queueing" Lu Fengqi <lufq.fnst(a)cn.fujitsu.com> btrfs: fix pinned underflow after transaction aborted Andreas Gruenbacher <agruenba(a)redhat.com> gfs2: Put bitmap buffers in put_super YueHaibing <yuehaibing(a)huawei.com> SUNRPC: drop pointless static qualifier in xdr_get_next_encode_buffer() Minchan Kim <minchan(a)kernel.org> zram: close udev startup race condition as default groups Vignesh R <vigneshr(a)ti.com> i2c: omap: Enable for ARCH_K3 Jeremy Linton <jeremy.linton(a)arm.com> lib/raid6: Fix arm64 test build Geert Uytterhoeven <geert(a)linux-m68k.org> hwmon: (ibmpowernv) Remove bogus __init annotations Taehee Yoo <ap420073(a)gmail.com> netfilter: xt_IDLETIMER: add sysfs filename checking routine Jozsef Kadlecsik <kadlec(a)blackhole.kfki.hu> netfilter: ipset: Correct rcu_dereference() call in ip_set_put_comment() Justin M. Forbes <jforbes(a)fedoraproject.org> s390/mm: Fix ERROR: "__node_distance" undefined! Eric Westbrook <eric(a)westbrook.io> netfilter: ipset: actually allow allowable CIDR 0 in hash:net,port,net Vasily Gorbik <gor(a)linux.ibm.com> s390/vdso: add missing FORCE to build targets Nathan Chancellor <natechancellor(a)gmail.com> arm64: percpu: Initialize ret in the default case Paul Gortmaker <paul.gortmaker(a)windriver.com> platform/x86: acerhdf: Add BIOS entry for Gateway LT31 v1.3307 Marek Szyprowski <m.szyprowski(a)samsung.com> clk: samsung: exynos5420: Enable PERIS clocks for suspend Chengguang Xu <cgxu519(a)gmx.com> fs/exofs: fix potential memory leak in mount option parsing Richard Weinberger <richard(a)nod.at> um: Give start_idle_thread() a return code Ernesto A. Fernández <ernesto.mnd.fernandez(a)gmail.com> hfsplus: prevent btree data loss on root split Ernesto A. Fernández <ernesto.mnd.fernandez(a)gmail.com> hfs: prevent btree data loss on root split Jann Horn <jannh(a)google.com> reiserfs: propagate errors from fill_with_dentries() properly Matthias Kaehlcke <mka(a)chromium.org> x86/build: Use cc-option to validate stack alignment parameter Matthias Kaehlcke <mka(a)chromium.org> x86/build: Fix stack alignment for CLang Michael Davidson <md(a)google.com> x86/boot: #undef memcpy() et al in string.c Matthias Kaehlcke <mka(a)chromium.org> x86/build: Specify stack alignment for clang Matthias Kaehlcke <mka(a)chromium.org> x86/build: Use __cc-option for boot code compiler options Matthias Kaehlcke <mka(a)chromium.org> kbuild: Add __cc-option macro Matthias Kaehlcke <mka(a)chromium.org> x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang incompatibility Michael Davidson <md(a)google.com> crypto, x86: aesni - fix token pasting for clang Matthias Kaehlcke <mka(a)chromium.org> x86/kbuild: Use cc-option to enable -falign-{jumps/loops} Matthias Kaehlcke <mka(a)chromium.org> arm64: Disable asm-operand-width warning for clang Stefan Agner <stefan(a)agner.ch> kbuild: allow to use GCC toolchain not in Clang search path Stefan Agner <stefan(a)agner.ch> kbuild: set no-integrated-as before incl. arch Makefile Sodagudi Prasad <psodagud(a)codeaurora.org> kbuild: clang: disable unused variable warnings only when constant Nick Desaulniers <nick.desaulniers(a)gmail.com> kbuild: clang: remove crufty HOSTCFLAGS David Lin <dtwlin(a)google.com> kbuild: clang: fix build failures with sparse check Masahiro Yamada <yamada.masahiro(a)socionext.com> kbuild: move cc-option and cc-disable-warning after incl. arch Makefile Chris Fries <cfries(a)google.com> kbuild: Set KBUILD_CFLAGS before incl. arch Makefile Nick Desaulniers <ndesaulniers(a)google.com> kbuild: fix linker feature test macros when cross compiling with Clang Ard Biesheuvel <ard.biesheuvel(a)linaro.org> efi/libstub/arm64: Set -fpie when building the EFI stub Ard Biesheuvel <ard.biesheuvel(a)linaro.org> efi/libstub/arm64: Force 'hidden' visibility for section markers Ard Biesheuvel <ard.biesheuvel(a)linaro.org> crypto: arm64/sha - avoid non-standard inline asm tricks Matthias Kaehlcke <mka(a)chromium.org> kbuild: clang: Disable 'address-of-packed-member' warning Arnd Bergmann <arnd(a)arndb.de> modules: mark __inittest/__exittest as __maybe_unused Vinícius Tinti <viniciustinti(a)gmail.com> kbuild: Add support to generate LLVM assembly files Behan Webster <behanw(a)converseincode.com> kbuild: use -Oz instead of -Os when using clang Mark Charlebois <charlebm(a)gmail.com> kbuild, LLVMLinux: Add -Werror to cc-option to support clang Masahiro Yamada <yamada.masahiro(a)socionext.com> kbuild: drop -Wno-unknown-warning-option from clang options Jeroen Hofstee <jeroen(a)myspectrum.nl> kbuild: fix asm-offset generation to work with clang Masahiro Yamada <yamada.masahiro(a)socionext.com> kbuild: consolidate redundant sed script ASM offset generation Matthias Kaehlcke <mka(a)chromium.org> kbuild: Consolidate header generation from ASM offset information Michael Davidson <md(a)google.com> kbuild: clang: add -no-integrated-as to KBUILD_[AC]FLAGS Behan Webster <behanw(a)converseincode.com> kbuild: Add better clang cross build support David Ahern <dsahern(a)gmail.com> ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF Siva Reddy Kallam <siva.kallam(a)broadcom.com> tg3: Add PHY reset for 5717/5719/5720 in change ring and flow control paths Eric Dumazet <edumazet(a)google.com> net-gro: reset skb->pkt_type in napi_reuse_skb() Sabrina Dubroca <sd(a)queasysnail.net> ip_tunnel: don't force DF when MTU is locked 배석진 <soukjin.bae(a)samsung.com> flow_dissector: do not dissect l4 ports for fragments ------------- Diffstat: .gitignore | 1 + Kbuild | 25 --------------- Makefile | 41 ++++++++++++++++-------- arch/arm64/Makefile | 4 +++ arch/arm64/crypto/sha1-ce-core.S | 6 ++-- arch/arm64/crypto/sha1-ce-glue.c | 11 ++----- arch/arm64/crypto/sha2-ce-core.S | 6 ++-- arch/arm64/crypto/sha2-ce-glue.c | 13 +++----- arch/arm64/include/asm/percpu.h | 3 ++ arch/ia64/kernel/Makefile | 26 ++-------------- arch/s390/kernel/vdso32/Makefile | 6 ++-- arch/s390/kernel/vdso64/Makefile | 6 ++-- arch/s390/numa/numa.c | 1 + arch/um/os-Linux/skas/process.c | 5 +++ arch/x86/Makefile | 39 +++++++++++++++++------ arch/x86/boot/compressed/aslr.c | 3 +- arch/x86/boot/string.c | 9 ++++++ arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 7 ++--- arch/x86/include/asm/asm.h | 1 + drivers/acpi/acpi_platform.c | 1 + drivers/block/zram/zram_drv.c | 26 ++++------------ drivers/bluetooth/Kconfig | 1 - drivers/clk/samsung/clk-exynos5420.c | 1 + drivers/firmware/efi/libstub/Makefile | 2 +- drivers/firmware/efi/libstub/arm64-stub.c | 10 +++++- drivers/hid/uhid.c | 13 ++++++++ drivers/hwmon/ibmpowernv.c | 7 ++--- drivers/i2c/busses/Kconfig | 2 +- drivers/media/v4l2-core/v4l2-event.c | 43 ++++++++++++++------------ drivers/media/v4l2-core/videobuf2-core.c | 9 ++---- drivers/misc/atmel-ssc.c | 2 +- drivers/misc/sgi-gru/grukdump.c | 4 +++ drivers/net/ethernet/broadcom/tg3.c | 18 +++++++++-- drivers/platform/x86/acerhdf.c | 1 + drivers/uio/uio.c | 7 +++-- drivers/usb/class/cdc-acm.c | 3 ++ drivers/usb/core/quirks.c | 8 +++++ drivers/usb/misc/appledisplay.c | 1 + fs/btrfs/disk-io.c | 19 +++++++++--- fs/exofs/super.c | 5 ++- fs/gfs2/rgrp.c | 3 +- fs/hfs/brec.c | 4 +++ fs/hfsplus/brec.c | 4 +++ fs/reiserfs/xattr.c | 7 +++++ include/linux/kbuild.h | 6 ++-- include/linux/module.h | 4 +-- include/linux/netfilter/ipset/ip_set_comment.h | 4 +-- include/linux/uaccess.h | 3 ++ lib/raid6/test/Makefile | 4 +-- net/core/dev.c | 4 +++ net/core/flow_dissector.c | 4 +-- net/ipv4/ip_tunnel_core.c | 2 +- net/ipv6/route.c | 8 +++-- net/netfilter/ipset/ip_set_hash_netportnet.c | 8 ++--- net/netfilter/xt_IDLETIMER.c | 20 ++++++++++++ net/sunrpc/xdr.c | 2 +- scripts/Kbuild.include | 18 +++++++---- scripts/Makefile.build | 8 +++++ scripts/Makefile.extrawarn | 1 - scripts/Makefile.lib | 31 +++++++++++++++++++ scripts/mod/Makefile | 28 ++--------------- 61 files changed, 349 insertions(+), 220 deletions(-)

5 years

9
88
0 0

[PATCH v2 1/2] powerpc/rtas: Restrict RTAS requests from userspace

by Andrew Donnellan

A number of userspace utilities depend on making calls to RTAS to retrieve information and update various things. The existing API through which we expose RTAS to userspace exposes more RTAS functionality than we actually need, through the sys_rtas syscall, which allows root (or anyone with CAP_SYS_ADMIN) to make any RTAS call they want with arbitrary arguments. Many RTAS calls take the address of a buffer as an argument, and it's up to the caller to specify the physical address of the buffer as an argument. We allocate a buffer (the "RMO buffer") in the Real Memory Area that RTAS can access, and then expose the physical address and size of this buffer in /proc/powerpc/rtas/rmo_buffer. Userspace is expected to read this address, poke at the buffer using /dev/mem, and pass an address in the RMO buffer to the RTAS call. However, there's nothing stopping the caller from specifying whatever address they want in the RTAS call, and it's easy to construct a series of RTAS calls that can overwrite arbitrary bytes (even without /dev/mem access). Additionally, there are some RTAS calls that do potentially dangerous things and for which there are no legitimate userspace use cases. In the past, this would not have been a particularly big deal as it was assumed that root could modify all system state freely, but with Secure Boot and lockdown we need to care about this. We can't fundamentally change the ABI at this point, however we can address this by implementing a filter that checks RTAS calls against a list of permitted calls and forces the caller to use addresses within the RMO buffer. The list is based off the list of calls that are used by the librtas userspace library, and has been tested with a number of existing userspace RTAS utilities. For compatibility with any applications we are not aware of that require other calls, the filter can be turned off at build time. Reported-by: Daniel Axtens <dja(a)axtens.net> Cc: stable(a)vger.kernel.org Signed-off-by: Andrew Donnellan <ajd(a)linux.ibm.com> --- v1->v2: - address comments from mpe - shorten the names of some struct members - make the filter array static/ro_after_init, use const char * - genericise the fixed buffer size cases - simplify/get rid of some of the error printing - get rid of rtas_token_name() --- arch/powerpc/Kconfig | 13 ++++ arch/powerpc/kernel/rtas.c | 153 +++++++++++++++++++++++++++++++++++++ 2 files changed, 166 insertions(+) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1f48bbfb3ce9..8dd42b82379b 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -989,6 +989,19 @@ config PPC_SECVAR_SYSFS read/write operations on these variables. Say Y if you have secure boot enabled and want to expose variables to userspace. +config PPC_RTAS_FILTER + bool "Enable filtering of RTAS syscalls" + default y + depends on PPC_RTAS + help + The RTAS syscall API has security issues that could be used to + compromise system integrity. This option enforces restrictions on the + RTAS calls and arguments passed by userspace programs to mitigate + these issues. + + Say Y unless you know what you are doing and the filter is causing + problems for you. + endmenu config ISA_DMA_API diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index 806d554ce357..954f41676f69 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -992,6 +992,147 @@ struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log, return NULL; } +#ifdef CONFIG_PPC_RTAS_FILTER + +/* + * The sys_rtas syscall, as originally designed, allows root to pass + * arbitrary physical addresses to RTAS calls. A number of RTAS calls + * can be abused to write to arbitrary memory and do other things that + * are potentially harmful to system integrity, and thus should only + * be used inside the kernel and not exposed to userspace. + * + * All known legitimate users of the sys_rtas syscall will only ever + * pass addresses that fall within the RMO buffer, and use a known + * subset of RTAS calls. + * + * Accordingly, we filter RTAS requests to check that the call is + * permitted, and that provided pointers fall within the RMO buffer. + * The rtas_filters list contains an entry for each permitted call, + * with the indexes of the parameters which are expected to contain + * addresses and sizes of buffers allocated inside the RMO buffer. + */ +struct rtas_filter { + const char *name; + int token; + /* Indexes into the args buffer, -1 if not used */ + int buf_idx1; + int size_idx1; + int buf_idx2; + int size_idx2; + + int fixed_size; +}; + +static struct rtas_filter rtas_filters[] __ro_after_init = { + { "ibm,activate-firmware", -1, -1, -1, -1, -1 }, + { "ibm,configure-connector", -1, 0, -1, 1, -1, 4096 }, /* Special cased */ + { "display-character", -1, -1, -1, -1, -1 }, + { "ibm,display-message", -1, 0, -1, -1, -1 }, + { "ibm,errinjct", -1, 2, -1, -1, -1, 1024 }, + { "ibm,close-errinjct", -1, -1, -1, -1, -1 }, + { "ibm,open-errinct", -1, -1, -1, -1, -1 }, + { "ibm,get-config-addr-info2", -1, -1, -1, -1, -1 }, + { "ibm,get-dynamic-sensor-state", -1, 1, -1, -1, -1 }, + { "ibm,get-indices", -1, 2, 3, -1, -1 }, + { "get-power-level", -1, -1, -1, -1, -1 }, + { "get-sensor-state", -1, -1, -1, -1, -1 }, + { "ibm,get-system-parameter", -1, 1, 2, -1, -1 }, + { "get-time-of-day", -1, -1, -1, -1, -1 }, + { "ibm,get-vpd", -1, 0, -1, 1, 2 }, + { "ibm,lpar-perftools", -1, 2, 3, -1, -1 }, + { "ibm,platform-dump", -1, 4, 5, -1, -1 }, + { "ibm,read-slot-reset-state", -1, -1, -1, -1, -1 }, + { "ibm,scan-log-dump", -1, 0, 1, -1, -1 }, + { "ibm,set-dynamic-indicator", -1, 2, -1, -1, -1 }, + { "ibm,set-eeh-option", -1, -1, -1, -1, -1 }, + { "set-indicator", -1, -1, -1, -1, -1 }, + { "set-power-level", -1, -1, -1, -1, -1 }, + { "set-time-for-power-on", -1, -1, -1, -1, -1 }, + { "ibm,set-system-parameter", -1, 1, -1, -1, -1 }, + { "set-time-of-day", -1, -1, -1, -1, -1 }, + { "ibm,suspend-me", -1, -1, -1, -1, -1 }, + { "ibm,update-nodes", -1, 0, -1, -1, -1, 4096 }, + { "ibm,update-properties", -1, 0, -1, -1, -1, 4096 }, + { "ibm,physical-attestation", -1, 0, 1, -1, -1 }, +}; + +static bool in_rmo_buf(u32 base, u32 end) +{ + return base >= rtas_rmo_buf && + base < (rtas_rmo_buf + RTAS_RMOBUF_MAX) && + base <= end && + end >= rtas_rmo_buf && + end < (rtas_rmo_buf + RTAS_RMOBUF_MAX); +} + +static bool block_rtas_call(int token, int nargs, + struct rtas_args *args) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) { + struct rtas_filter *f = &rtas_filters[i]; + u32 base, size, end; + + if (token != f->token) + continue; + + if (f->buf_idx1 != -1) { + base = be32_to_cpu(args->args[f->buf_idx1]); + if (f->size_idx1 != -1) + size = be32_to_cpu(args->args[f->size_idx1]); + else if (f->fixed_size) + size = f->fixed_size; + else + size = 1; + + end = base + size - 1; + if (!in_rmo_buf(base, end)) + goto err; + } + + if (f->buf_idx2 != -1) { + base = be32_to_cpu(args->args[f->buf_idx2]); + if (f->size_idx2 != -1) + size = be32_to_cpu(args->args[f->size_idx2]); + else if (f->fixed_size) + size = f->fixed_size; + else + size = 1; + end = base + size - 1; + + /* + * Special case for ibm,configure-connector where the + * address can be 0 + */ + if (!strcmp(f->name, "ibm,configure-connector") && + base == 0) + return false; + + if (!in_rmo_buf(base, end)) + goto err; + } + + return false; + } + +err: + pr_err_ratelimited("sys_rtas: RTAS call blocked - exploit attempt?\n"); + pr_err_ratelimited("sys_rtas: token=0x%x, nargs=%d (called by %s)\n", + token, nargs, current->comm); + return true; +} + +#else + +static bool block_rtas_call(int token, int nargs, + struct rtas_args *args) +{ + return false; +} + +#endif /* CONFIG_PPC_RTAS_FILTER */ + /* We assume to be passed big endian arguments */ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs) { @@ -1029,6 +1170,9 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs) args.rets = &args.args[nargs]; memset(args.rets, 0, nret * sizeof(rtas_arg_t)); + if (block_rtas_call(token, nargs, &args)) + return -EINVAL; + /* Need to handle ibm,suspend_me call specially */ if (token == ibm_suspend_me_token) { @@ -1090,6 +1234,9 @@ void __init rtas_initialize(void) unsigned long rtas_region = RTAS_INSTANTIATE_MAX; u32 base, size, entry; int no_base, no_size, no_entry; +#ifdef CONFIG_PPC_RTAS_FILTER + int i; +#endif /* Get RTAS dev node and fill up our "rtas" structure with infos * about it. @@ -1129,6 +1276,12 @@ void __init rtas_initialize(void) #ifdef CONFIG_RTAS_ERROR_LOGGING rtas_last_error_token = rtas_token("rtas-last-error"); #endif + +#ifdef CONFIG_PPC_RTAS_FILTER + for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) { + rtas_filters[i].token = rtas_token(rtas_filters[i].name); + } +#endif } int __init early_init_dt_scan_rtas(unsigned long node, -- 2.20.1

5 years

5
7
0 0

FAILED: patch "[PATCH] pinctrl: baytrail: Fix pin being driven low for a while on" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 156abe2961601d60a8c2a60c6dc8dd6ce7adcdaf Mon Sep 17 00:00:00 2001 From: Hans de Goede <hdegoede(a)redhat.com> Date: Sat, 6 Jun 2020 11:31:50 +0200 Subject: [PATCH] pinctrl: baytrail: Fix pin being driven low for a while on gpiod_get(..., GPIOD_OUT_HIGH) The pins on the Bay Trail SoC have separate input-buffer and output-buffer enable bits and a read of the level bit of the value register will always return the value from the input-buffer. The BIOS of a device may configure a pin in output-only mode, only enabling the output buffer, and write 1 to the level bit to drive the pin high. This 1 written to the level bit will be stored inside the data-latch of the output buffer. But a subsequent read of the value register will return 0 for the level bit because the input-buffer is disabled. This causes a read-modify-write as done by byt_gpio_set_direction() to write 0 to the level bit, driving the pin low! Before this commit byt_gpio_direction_output() relied on pinctrl_gpio_direction_output() to set the direction, followed by a call to byt_gpio_set() to apply the selected value. This causes the pin to go low between the pinctrl_gpio_direction_output() and byt_gpio_set() calls. Change byt_gpio_direction_output() to directly make the register modifications itself instead. Replacing the 2 subsequent writes to the value register with a single write. Note that the pinctrl code does not keep track internally of the direction, so not going through pinctrl_gpio_direction_output() is not an issue. This issue was noticed on a Trekstor SurfTab Twin 10.1. When the panel is already on at boot (no external monitor connected), then the i915 driver does a gpiod_get(..., GPIOD_OUT_HIGH) for the panel-enable GPIO. The temporarily going low of that GPIO was causing the panel to reset itself after which it would not show an image until it was turned off and back on again (until a full modeset was done on it). This commit fixes this. This commit also updates the byt_gpio_direction_input() to use direct register accesses instead of going through pinctrl_gpio_direction_input(), to keep it consistent with byt_gpio_direction_output(). Note for backporting, this commit depends on: commit e2b74419e5cc ("pinctrl: baytrail: Replace WARN with dev_info_once when setting direct-irq pin to output") Cc: stable(a)vger.kernel.org Fixes: 86e3ef812fe3 ("pinctrl: baytrail: Update gpio chip operations") Signed-off-by: Hans de Goede <hdegoede(a)redhat.com> Acked-by: Mika Westerberg <mika.westerberg(a)linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com> diff --git a/drivers/pinctrl/intel/pinctrl-baytrail.c b/drivers/pinctrl/intel/pinctrl-baytrail.c index e3ceb3dfeabe..a917a2df520e 100644 --- a/drivers/pinctrl/intel/pinctrl-baytrail.c +++ b/drivers/pinctrl/intel/pinctrl-baytrail.c @@ -800,6 +800,21 @@ static void byt_gpio_disable_free(struct pinctrl_dev *pctl_dev, pm_runtime_put(vg->dev); } +static void byt_gpio_direct_irq_check(struct intel_pinctrl *vg, + unsigned int offset) +{ + void __iomem *conf_reg = byt_gpio_reg(vg, offset, BYT_CONF0_REG); + + /* + * Before making any direction modifications, do a check if gpio is set + * for direct IRQ. On Bay Trail, setting GPIO to output does not make + * sense, so let's at least inform the caller before they shoot + * themselves in the foot. + */ + if (readl(conf_reg) & BYT_DIRECT_IRQ_EN) + dev_info_once(vg->dev, "Potential Error: Setting GPIO with direct_irq_en to output"); +} + static int byt_gpio_set_direction(struct pinctrl_dev *pctl_dev, struct pinctrl_gpio_range *range, unsigned int offset, @@ -807,7 +822,6 @@ static int byt_gpio_set_direction(struct pinctrl_dev *pctl_dev, { struct intel_pinctrl *vg = pinctrl_dev_get_drvdata(pctl_dev); void __iomem *val_reg = byt_gpio_reg(vg, offset, BYT_VAL_REG); - void __iomem *conf_reg = byt_gpio_reg(vg, offset, BYT_CONF0_REG); unsigned long flags; u32 value; @@ -817,14 +831,8 @@ static int byt_gpio_set_direction(struct pinctrl_dev *pctl_dev, value &= ~BYT_DIR_MASK; if (input) value |= BYT_OUTPUT_EN; - else if (readl(conf_reg) & BYT_DIRECT_IRQ_EN) - /* - * Before making any direction modifications, do a check if gpio - * is set for direct IRQ. On baytrail, setting GPIO to output - * does not make sense, so let's at least inform the caller before - * they shoot themselves in the foot. - */ - dev_info_once(vg->dev, "Potential Error: Setting GPIO with direct_irq_en to output"); + else + byt_gpio_direct_irq_check(vg, offset); writel(value, val_reg); @@ -1165,19 +1173,50 @@ static int byt_gpio_get_direction(struct gpio_chip *chip, unsigned int offset) static int byt_gpio_direction_input(struct gpio_chip *chip, unsigned int offset) { - return pinctrl_gpio_direction_input(chip->base + offset); + struct intel_pinctrl *vg = gpiochip_get_data(chip); + void __iomem *val_reg = byt_gpio_reg(vg, offset, BYT_VAL_REG); + unsigned long flags; + u32 reg; + + raw_spin_lock_irqsave(&byt_lock, flags); + + reg = readl(val_reg); + reg &= ~BYT_DIR_MASK; + reg |= BYT_OUTPUT_EN; + writel(reg, val_reg); + + raw_spin_unlock_irqrestore(&byt_lock, flags); + return 0; } +/* + * Note despite the temptation this MUST NOT be converted into a call to + * pinctrl_gpio_direction_output() + byt_gpio_set() that does not work this + * MUST be done as a single BYT_VAL_REG register write. + * See the commit message of the commit adding this comment for details. + */ static int byt_gpio_direction_output(struct gpio_chip *chip, unsigned int offset, int value) { - int ret = pinctrl_gpio_direction_output(chip->base + offset); + struct intel_pinctrl *vg = gpiochip_get_data(chip); + void __iomem *val_reg = byt_gpio_reg(vg, offset, BYT_VAL_REG); + unsigned long flags; + u32 reg; - if (ret) - return ret; + raw_spin_lock_irqsave(&byt_lock, flags); + + byt_gpio_direct_irq_check(vg, offset); - byt_gpio_set(chip, offset, value); + reg = readl(val_reg); + reg &= ~BYT_DIR_MASK; + if (value) + reg |= BYT_LEVEL; + else + reg &= ~BYT_LEVEL; + writel(reg, val_reg); + + raw_spin_unlock_irqrestore(&byt_lock, flags); return 0; }

5 years

3
2
0 0

FAILED: patch "[PATCH] btrfs: sysfs: init devices outside of the chunk_mutex" failed to apply to 4.9-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.9-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From ca10845a56856fff4de3804c85e6424d0f6d0cde Mon Sep 17 00:00:00 2001 From: Josef Bacik <josef(a)toxicpanda.com> Date: Tue, 1 Sep 2020 08:09:01 -0400 Subject: [PATCH] btrfs: sysfs: init devices outside of the chunk_mutex While running btrfs/061, btrfs/073, btrfs/078, or btrfs/178 we hit the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.9.0-rc3+ #4 Not tainted ------------------------------------------------------ kswapd0/100 is trying to acquire lock: ffff96ecc22ef4a0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x330 but task is already holding lock: ffffffff8dd74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire+0x65/0x80 slab_pre_alloc_hook.constprop.0+0x20/0x200 kmem_cache_alloc+0x37/0x270 alloc_inode+0x82/0xb0 iget_locked+0x10d/0x2c0 kernfs_get_inode+0x1b/0x130 kernfs_get_tree+0x136/0x240 sysfs_get_tree+0x16/0x40 vfs_get_tree+0x28/0xc0 path_mount+0x434/0xc00 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #2 (kernfs_mutex){+.+.}-{3:3}: __mutex_lock+0x7e/0x7e0 kernfs_add_one+0x23/0x150 kernfs_create_link+0x63/0xa0 sysfs_do_create_link_sd+0x5e/0xd0 btrfs_sysfs_add_devices_dir+0x81/0x130 btrfs_init_new_device+0x67f/0x1250 btrfs_ioctl+0x1ef/0x2e20 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}: __mutex_lock+0x7e/0x7e0 btrfs_chunk_alloc+0x125/0x3a0 find_free_extent+0xdf6/0x1210 btrfs_reserve_extent+0xb3/0x1b0 btrfs_alloc_tree_block+0xb0/0x310 alloc_tree_block_no_bg_flush+0x4a/0x60 __btrfs_cow_block+0x11a/0x530 btrfs_cow_block+0x104/0x220 btrfs_search_slot+0x52e/0x9d0 btrfs_insert_empty_items+0x64/0xb0 btrfs_insert_delayed_items+0x90/0x4f0 btrfs_commit_inode_delayed_items+0x93/0x140 btrfs_log_inode+0x5de/0x2020 btrfs_log_inode_parent+0x429/0xc90 btrfs_log_new_name+0x95/0x9b btrfs_rename2+0xbb9/0x1800 vfs_rename+0x64f/0x9f0 do_renameat2+0x320/0x4e0 __x64_sys_rename+0x1f/0x30 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&delayed_node->mutex){+.+.}-{3:3}: __lock_acquire+0x119c/0x1fc0 lock_acquire+0xa7/0x3d0 __mutex_lock+0x7e/0x7e0 __btrfs_release_delayed_node.part.0+0x3f/0x330 btrfs_evict_inode+0x24c/0x500 evict+0xcf/0x1f0 dispose_list+0x48/0x70 prune_icache_sb+0x44/0x50 super_cache_scan+0x161/0x1e0 do_shrink_slab+0x178/0x3c0 shrink_slab+0x17c/0x290 shrink_node+0x2b2/0x6d0 balance_pgdat+0x30a/0x670 kswapd+0x213/0x4c0 kthread+0x138/0x160 ret_from_fork+0x1f/0x30 other info that might help us debug this: Chain exists of: &delayed_node->mutex --> kernfs_mutex --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(kernfs_mutex); lock(fs_reclaim); lock(&delayed_node->mutex); *** DEADLOCK *** 3 locks held by kswapd0/100: #0: ffffffff8dd74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 #1: ffffffff8dd65c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290 #2: ffff96ed2ade30e0 (&type->s_umount_key#36){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0 stack backtrace: CPU: 0 PID: 100 Comm: kswapd0 Not tainted 5.9.0-rc3+ #4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack+0x8b/0xb8 check_noncircular+0x12d/0x150 __lock_acquire+0x119c/0x1fc0 lock_acquire+0xa7/0x3d0 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 __mutex_lock+0x7e/0x7e0 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 ? lock_acquire+0xa7/0x3d0 ? find_held_lock+0x2b/0x80 __btrfs_release_delayed_node.part.0+0x3f/0x330 btrfs_evict_inode+0x24c/0x500 evict+0xcf/0x1f0 dispose_list+0x48/0x70 prune_icache_sb+0x44/0x50 super_cache_scan+0x161/0x1e0 do_shrink_slab+0x178/0x3c0 shrink_slab+0x17c/0x290 shrink_node+0x2b2/0x6d0 balance_pgdat+0x30a/0x670 kswapd+0x213/0x4c0 ? _raw_spin_unlock_irqrestore+0x41/0x50 ? add_wait_queue_exclusive+0x70/0x70 ? balance_pgdat+0x670/0x670 kthread+0x138/0x160 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x1f/0x30 This happens because we are holding the chunk_mutex at the time of adding in a new device. However we only need to hold the device_list_mutex, as we're going to iterate over the fs_devices devices. Move the sysfs init stuff outside of the chunk_mutex to get rid of this lockdep splat. CC: stable(a)vger.kernel.org # 4.4.x: f3cd2c58110dad14e: btrfs: sysfs, rename device_link add/remove functions CC: stable(a)vger.kernel.org # 4.4.x Reported-by: David Sterba <dsterba(a)suse.com> Signed-off-by: Josef Bacik <josef(a)toxicpanda.com> Reviewed-by: David Sterba <dsterba(a)suse.com> Signed-off-by: David Sterba <dsterba(a)suse.com> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 9d169cba8514..c86ffad04641 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2597,9 +2597,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path btrfs_set_super_num_devices(fs_info->super_copy, orig_super_num_devices + 1); - /* add sysfs device entry */ - btrfs_sysfs_add_devices_dir(fs_devices, device); - /* * we've got more storage, clear any full flags on the space * infos @@ -2607,6 +2604,10 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path btrfs_clear_space_info_full(fs_info); mutex_unlock(&fs_info->chunk_mutex); + + /* Add sysfs device entry */ + btrfs_sysfs_add_devices_dir(fs_devices, device); + mutex_unlock(&fs_devices->device_list_mutex); if (seeding_dev) {

5 years

3
2
0 0

[RFC 2/2] drm/i915/display: Protect pipe_update against dc3co exit

by Anshuman Gupta

At usual case DC3CO exit happen automatically by DMC f/w whenever PSR2 clears idle. This happens smoothly by DMC f/w to work with flips. But there are certain scenario where DC3CO Disallowed by driver asynchronous with flips. In such scenario display engine could be already in DC3CO state and driver has disallowed it, It initiates DC3CO exit sequence in DMC f/w which requires a dc3co exit delay of 200us in driver. It requires to protect intel_pipe_update_{update_end} with dc3co exit delay. Cc: Imre Deak <imre.deak(a)intel.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Anshuman Gupta <anshuman.gupta(a)intel.com> --- drivers/gpu/drm/i915/display/intel_display.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index ba26545392bc..3b81b98c0daf 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -15924,6 +15924,8 @@ static void intel_update_crtc(struct intel_atomic_state *state, else intel_fbc_enable(state, crtc); + /* Protect intel_pipe_update_{start,end} with power_domians lock */ + mutex_lock(&dev_priv->power_domains.lock); /* Perform vblank evasion around commit operation */ intel_pipe_update_start(new_crtc_state); @@ -15935,6 +15937,7 @@ static void intel_update_crtc(struct intel_atomic_state *state, i9xx_update_planes_on_crtc(state, crtc); intel_pipe_update_end(new_crtc_state); + mutex_unlock(&dev_prive->power_domains.lock); /* * We usually enable FIFO underrun interrupts as part of the -- 2.26.2

5 years

3
6
0 0

[PATCH] exfat: Avoid allocating upcase table using kcalloc()

by Artem Labazov

The table for Unicode upcase conversion requires an order-5 allocation, which may fail on a highly-fragmented system: pool-udisksd: page allocation failure: order:5, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0 CPU: 4 PID: 3756880 Comm: pool-udisksd Tainted: G U 5.8.10-200.fc32.x86_64 #1 Hardware name: Dell Inc. XPS 13 9360/0PVG6D, BIOS 2.13.0 11/14/2019 Call Trace: dump_stack+0x6b/0x88 warn_alloc.cold+0x75/0xd9 ? _cond_resched+0x16/0x40 ? __alloc_pages_direct_compact+0x144/0x150 __alloc_pages_slowpath.constprop.0+0xcfa/0xd30 ? __schedule+0x28a/0x840 ? __wait_on_bit_lock+0x92/0xa0 __alloc_pages_nodemask+0x2df/0x320 kmalloc_order+0x1b/0x80 kmalloc_order_trace+0x1d/0xa0 exfat_create_upcase_table+0x115/0x390 [exfat] exfat_fill_super+0x3ef/0x7f0 [exfat] ? sget_fc+0x1d0/0x240 ? exfat_init_fs_context+0x120/0x120 [exfat] get_tree_bdev+0x15c/0x250 vfs_get_tree+0x25/0xb0 do_mount+0x7c3/0xaf0 ? copy_mount_options+0xab/0x180 __x64_sys_mount+0x8e/0xd0 do_syscall_64+0x4d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Make the driver use vmalloc() to eliminate the issue. Cc: stable(a)vger.kernel.org # v5.7+ Signed-off-by: Artem Labazov <123321artyom(a)gmail.com> --- fs/exfat/nls.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/exfat/nls.c b/fs/exfat/nls.c index 675d0e7058c5..0582faf8de77 100644 --- a/fs/exfat/nls.c +++ b/fs/exfat/nls.c @@ -6,6 +6,7 @@ #include <linux/string.h> #include <linux/slab.h> #include <linux/buffer_head.h> +#include <linux/vmalloc.h> #include <asm/unaligned.h> #include "exfat_raw.h" @@ -659,7 +660,7 @@ static int exfat_load_upcase_table(struct super_block *sb, unsigned char skip = false; unsigned short *upcase_table; - upcase_table = kcalloc(UTBL_COUNT, sizeof(unsigned short), GFP_KERNEL); + upcase_table = vmalloc(UTBL_COUNT*sizeof(unsigned short)); if (!upcase_table) return -ENOMEM; @@ -715,7 +716,7 @@ static int exfat_load_default_upcase_table(struct super_block *sb) unsigned short uni = 0, *upcase_table; unsigned int index = 0; - upcase_table = kcalloc(UTBL_COUNT, sizeof(unsigned short), GFP_KERNEL); + upcase_table = vmalloc(UTBL_COUNT*sizeof(unsigned short)); if (!upcase_table) return -ENOMEM; @@ -803,5 +804,5 @@ int exfat_create_upcase_table(struct super_block *sb) void exfat_free_upcase_table(struct exfat_sb_info *sbi) { - kfree(sbi->vol_utbl); + vfree(sbi->vol_utbl); } -- 2.26.2

5 years

4
8
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror November 2020