November 2022 - Linux-stable-mirror

[PATCH 1/1] net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero

by Lee Jones

Currently, due to the sequential use of min_t() and clamp_t() macros,
in cdc_ncm_check_tx_max(), if dwNtbOutMaxSize is not set, the logic
sets tx_max to 0.  This is then used to allocate the data area of the
SKB requested later in cdc_ncm_fill_tx_frame().

This does not cause an issue presently because when memory is
allocated during initialisation phase of SKB creation, more memory
(512b) is allocated than is required for the SKB headers alone (320b),
leaving some space (512b - 320b = 192b) for CDC data (172b).

However, if more elements (for example 3 x u64 = [24b]) were added to
one of the SKB header structs, say 'struct skb_shared_info',
increasing its original size (320b [320b aligned]) to something larger
(344b [384b aligned]), then suddenly the CDC data (172b) no longer
fits in the spare SKB data area (512b - 384b = 128b).

Consequently the SKB bounds checking semantics fails and panics:

  skbuff: skb_over_panic: text:ffffffff830a5b5f len:184 put:172 \
     head:ffff888119227c00 data:ffff888119227c00 tail:0xb8 end:0x80 dev:<NULL>

  ------------[ cut here ]------------
  kernel BUG at net/core/skbuff.c:110!
  RIP: 0010:skb_panic+0x14f/0x160 net/core/skbuff.c:106
  <snip>
  Call Trace:
   <IRQ>
   skb_over_panic+0x2c/0x30 net/core/skbuff.c:115
   skb_put+0x205/0x210 net/core/skbuff.c:1877
   skb_put_zero include/linux/skbuff.h:2270 [inline]
   cdc_ncm_ndp16 drivers/net/usb/cdc_ncm.c:1116 [inline]
   cdc_ncm_fill_tx_frame+0x127f/0x3d50 drivers/net/usb/cdc_ncm.c:1293
   cdc_ncm_tx_fixup+0x98/0xf0 drivers/net/usb/cdc_ncm.c:1514

By overriding the max value with the default CDC_NCM_NTB_MAX_SIZE_TX
when not offered through the system provided params, we ensure enough
data space is allocated to handle the CDC data, meaning no crash will
occur.

Cc: stable(a)vger.kernel.org
Cc: Oliver Neukum <oliver(a)neukum.org>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: linux-usb(a)vger.kernel.org
Cc: netdev(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Fixes: 289507d3364f9 ("net: cdc_ncm: use sysfs for rx/tx aggregation tuning")
Signed-off-by: Lee Jones <lee.jones(a)linaro.org>
---
 drivers/net/usb/cdc_ncm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 24753a4da7e60..e303b522efb50 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -181,6 +181,8 @@ static u32 cdc_ncm_check_tx_max(struct usbnet *dev, u32 new_tx)
 	min = ctx->max_datagram_size + ctx->max_ndp_size + sizeof(struct usb_cdc_ncm_nth32);
 
 	max = min_t(u32, CDC_NCM_NTB_MAX_SIZE_TX, le32_to_cpu(ctx->ncm_parm.dwNtbOutMaxSize));
+	if (max == 0)
+		max = CDC_NCM_NTB_MAX_SIZE_TX; /* dwNtbOutMaxSize not set */
 
 	/* some devices set dwNtbOutMaxSize too low for the above default */
 	min = min(min, max);
-- 
2.34.0.384.gca35af8252-goog
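The min_t()/clamp_t() interaction described in the patch above can be reproduced in a small userspace model. This is only a sketch: model_check_tx_max(), its helper functions and the value chosen for NTB_MAX_SIZE_TX are assumptions made for illustration, and min_needed stands in for the driver's datagram, NDP and header size sum.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real constant is defined by the kernel. */
#define NTB_MAX_SIZE_TX 32768u

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

static uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Hypothetical model of the tx_max calculation.
 * ntb_out_max_size plays the role of dwNtbOutMaxSize; apply_fix selects
 * between the behaviour before and after the patch.
 */
static uint32_t model_check_tx_max(uint32_t new_tx, uint32_t min_needed,
				   uint32_t ntb_out_max_size, int apply_fix)
{
	uint32_t max = min_u32(NTB_MAX_SIZE_TX, ntb_out_max_size);
	uint32_t min;

	if (apply_fix && max == 0)
		max = NTB_MAX_SIZE_TX; /* dwNtbOutMaxSize not set */

	/* some devices set dwNtbOutMaxSize too low for the above default */
	min = min_u32(min_needed, max);
	assert(min <= max); /* clamping needs a well-formed range */

	return clamp_u32(new_tx, min, max);
}
```

With ntb_out_max_size set to 0 and apply_fix disabled, both bounds collapse to 0 and every requested size is clamped to 0, which is the condition the patch guards against.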

2 years, 1 month


[PATCH 0/9] KVM backports to 5.10

by Rishabh Bhatnagar

This patch series backports a few VM preemption_status, steal_time and
PV TLB flushing fixes to the 5.10 stable kernel.

Most of the changes backport cleanly, except that I had to work around a
few because of missing support/APIs in the 5.10 kernel. I have captured
those in the changelog as well as in the individual patches.

Changelog
- Use mark_page_dirty_in_slot api without kvm argument (KVM: x86: Fix
  recording of guest steal time / preempted status)
- Avoid checking for xen_msr and SEV-ES conditions (KVM: x86:
  do not set st->preempted when going back to user space)
- Use VCPU_STAT macro to expose preemption_reported and
  preemption_other fields (KVM: x86: do not report a vCPU as preempted
  outside instruction boundaries)

David Woodhouse (2):
  KVM: x86: Fix recording of guest steal time / preempted status
  KVM: Fix steal time asm constraints

Lai Jiangshan (1):
  KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior

Paolo Bonzini (5):
  KVM: x86: do not set st->preempted when going back to user space
  KVM: x86: do not report a vCPU as preempted outside instruction
    boundaries
  KVM: x86: revalidate steal time cache if MSR value changes
  KVM: x86: do not report preemption if the steal time cache is stale
  KVM: x86: move guest_pv_has out of user_access section

Sean Christopherson (1):
  KVM: x86: Remove obsolete disabling of page faults in
    kvm_arch_vcpu_put()

 arch/x86/include/asm/kvm_host.h |   5 +-
 arch/x86/kvm/svm/svm.c          |   2 +
 arch/x86/kvm/vmx/vmx.c          |   1 +
 arch/x86/kvm/x86.c              | 164 ++++++++++++++++++++++----------
 4 files changed, 122 insertions(+), 50 deletions(-)

-- 
2.37.1

2 years, 2 months


[PATCH v2 1/8] drm: Disable the cursor plane on atomic contexts with virtualized drivers

by Zack Rusin

From: Zack Rusin <zackr(a)vmware.com>

Cursor planes on virtualized drivers have special meaning and require
that the clients handle them in specific ways, e.g. the cursor plane
should react to the mouse movement the way a mouse cursor would be
expected to and the client is required to set hotspot properties on it
in order for the mouse events to be routed correctly.

This breaks the contract as specified by the "universal planes". Fix it
by disabling the cursor planes on virtualized drivers while adding
a foundation on top of which it's possible to special case mouse cursor
planes for clients that want it.

Disabling the cursor planes makes some kms compositors which were broken,
e.g. Weston, fallback to software cursor which works fine or at least
better than currently while having no effect on others, e.g. gnome-shell
or kwin, which put virtualized drivers on a deny-list when running in
atomic context to make them fallback to legacy kms and avoid this issue.

Signed-off-by: Zack Rusin <zackr(a)vmware.com>
Fixes: 681e7ec73044 ("drm: Allow userspace to ask for universal plane list (v2)")
Cc: <stable(a)vger.kernel.org> # v5.4+
Cc: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com>
Cc: Maxime Ripard <mripard(a)kernel.org>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: David Airlie <airlied(a)linux.ie>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Gerd Hoffmann <kraxel(a)redhat.com>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: Gurchetan Singh <gurchetansingh(a)chromium.org>
Cc: Chia-I Wu <olvaffe(a)gmail.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: virtualization(a)lists.linux-foundation.org
Cc: spice-devel(a)lists.freedesktop.org
---
 drivers/gpu/drm/drm_plane.c          | 11 +++++++++++
 drivers/gpu/drm/qxl/qxl_drv.c        |  2 +-
 drivers/gpu/drm/vboxvideo/vbox_drv.c |  2 +-
 drivers/gpu/drm/virtio/virtgpu_drv.c |  3 ++-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |  2 +-
 include/drm/drm_drv.h                | 10 ++++++++++
 include/drm/drm_file.h               | 12 ++++++++++++
 7 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 726f2f163c26..e1e2a65c7119 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -667,6 +667,17 @@ int drm_mode_getplane_res(struct drm_device *dev, void *data,
 		    !file_priv->universal_planes)
 			continue;
 
+		/*
+		 * Unless userspace supports virtual cursor plane
+		 * then if we're running on virtual driver do not
+		 * advertise cursor planes because they'll be broken
+		 */
+		if (plane->type == DRM_PLANE_TYPE_CURSOR &&
+		    drm_core_check_feature(dev, DRIVER_VIRTUAL) &&
+		    file_priv->atomic &&
+		    !file_priv->supports_virtual_cursor_plane)
+			continue;
+
 		if (drm_lease_held(file_priv, plane->base.id)) {
 			if (count < plane_resp->count_planes &&
 			    put_user(plane->base.id, plane_ptr + count))
diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index 1cb6f0c224bb..0e4212e05caa 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -281,7 +281,7 @@ static const struct drm_ioctl_desc qxl_ioctls[] = {
 };
 
 static struct drm_driver qxl_driver = {
-	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
+	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_VIRTUAL,
 
 	.dumb_create = qxl_mode_dumb_create,
 	.dumb_map_offset = drm_gem_ttm_dumb_map_offset,
diff --git a/drivers/gpu/drm/vboxvideo/vbox_drv.c b/drivers/gpu/drm/vboxvideo/vbox_drv.c
index f4f2bd79a7cb..84e75bcc3384 100644
--- a/drivers/gpu/drm/vboxvideo/vbox_drv.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_drv.c
@@ -176,7 +176,7 @@ DEFINE_DRM_GEM_FOPS(vbox_fops);
 
 static const struct drm_driver driver = {
 	.driver_features =
-	    DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC,
+	    DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC | DRIVER_VIRTUAL,
 
 	.lastclose = drm_fb_helper_lastclose,
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c
index 5f25a8d15464..3c5bb006159a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -198,7 +198,8 @@ MODULE_AUTHOR("Alon Levy");
 DEFINE_DRM_GEM_FOPS(virtio_gpu_driver_fops);
 
 static const struct drm_driver driver = {
-	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC,
+	.driver_features =
+		DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC | DRIVER_VIRTUAL,
 	.open = virtio_gpu_driver_open,
 	.postclose = virtio_gpu_driver_postclose,
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 01a5b47e95f9..712f6ad0b014 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1581,7 +1581,7 @@ static const struct file_operations vmwgfx_driver_fops = {
 
 static const struct drm_driver driver = {
 	.driver_features =
-	DRIVER_MODESET | DRIVER_RENDER | DRIVER_ATOMIC | DRIVER_GEM,
+	DRIVER_MODESET | DRIVER_RENDER | DRIVER_ATOMIC | DRIVER_GEM | DRIVER_VIRTUAL,
 	.ioctls = vmw_ioctls,
 	.num_ioctls = ARRAY_SIZE(vmw_ioctls),
 	.master_set = vmw_master_set,
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index f6159acb8856..c4cd7fc350d9 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -94,6 +94,16 @@ enum drm_driver_feature {
 	 * synchronization of command submission.
 	 */
 	DRIVER_SYNCOBJ_TIMELINE         = BIT(6),
+	/**
+	 * @DRIVER_VIRTUAL:
+	 *
+	 * Driver is running on top of virtual hardware. The most significant
+	 * implication of this is a requirement of special handling of the
+	 * cursor plane (e.g. cursor plane has to actually track the mouse
+	 * cursor and the clients are required to set hotspot in order for
+	 * the cursor planes to work correctly).
+	 */
+	DRIVER_VIRTUAL			= BIT(7),
 
 	/* IMPORTANT: Below are all the legacy flags, add new ones above. */
 
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index e0a73a1e2df7..3e5c36891161 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -223,6 +223,18 @@ struct drm_file {
 	 */
 	bool is_master;
 
+	/**
+	 * @supports_virtual_cursor_plane:
+	 *
+	 * This client is capable of handling the cursor plane with the
+	 * restrictions imposed on it by the virtualized drivers.
+	 *
+	 * The implies that the cursor plane has to behave like a cursor
+	 * i.e. track cursor movement. It also requires setting of the
+	 * hotspot properties by the client on the cursor plane.
+	 */
+	bool supports_virtual_cursor_plane;
+
 	/**
 	 * @master:
 	 *
-- 
2.34.1
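The filtering condition added to drm_mode_getplane_res() in the patch above can be read as a four-way conjunction. The following userspace sketch restates it with simplified stand-in types; enum plane_type, struct client_state and hide_plane() are hypothetical names used only for illustration and are not part of the DRM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; illustrative only. */
enum plane_type { PLANE_OVERLAY, PLANE_PRIMARY, PLANE_CURSOR };

struct client_state {
	bool atomic;                        /* client enabled atomic modesetting */
	bool supports_virtual_cursor_plane; /* client opted in to hotspot handling */
};

/*
 * Returns true when the plane would be left out of the list reported to
 * the client. All four conditions must hold, as in the patch.
 */
static bool hide_plane(enum plane_type type, bool driver_is_virtual,
		       const struct client_state *client)
{
	assert(client != NULL);

	return type == PLANE_CURSOR &&
	       driver_is_virtual &&
	       client->atomic &&
	       !client->supports_virtual_cursor_plane;
}
```

Under this reading, legacy (non-atomic) clients and clients that declare virtual cursor support still see the cursor plane; only atomic clients without that support have it hidden on virtualized drivers.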

2 years, 2 months


[v4 PATCH] fs/proc: task_mmu.c: don't read mapcount for migration entry

by Yang Shi

The syzbot reported the below BUG:

kernel BUG at include/linux/page-flags.h:785!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 4392 Comm: syz-executor560 Not tainted 5.16.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
Code: e8 d3 16 d1 ff 48 c7 c6 c0 00 b6 89 48 89 ef e8 94 4e 04 00 0f 0b e8 bd 16 d1 ff 48 c7 c6 60 01 b6 89 48 89 ef e8 7e 4e 04 00 <0f> 0b e8 a7 16 d1 ff 48 c7 c6 a0 01 b6 89 4c 89 f7 e8 68 4e 04 00
RSP: 0018:ffffc90002b6f7b8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888019619d00 RSI: ffffffff81a68c12 RDI: 0000000000000003
RBP: ffffea0001bdc2c0 R08: 0000000000000029 R09: 00000000ffffffff
R10: ffffffff8903e29f R11: 00000000ffffffff R12: 00000000ffffffff
R13: 00000000ffffea00 R14: ffffc90002b6fb30 R15: ffffea0001bd8001
FS:  00007faa2aefd700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff7e663318 CR3: 0000000018c6e000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 page_mapcount include/linux/mm.h:837 [inline]
 smaps_account+0x470/0xb10 fs/proc/task_mmu.c:466
 smaps_pte_entry fs/proc/task_mmu.c:538 [inline]
 smaps_pte_range+0x611/0x1250 fs/proc/task_mmu.c:601
 walk_pmd_range mm/pagewalk.c:128 [inline]
 walk_pud_range mm/pagewalk.c:205 [inline]
 walk_p4d_range mm/pagewalk.c:240 [inline]
 walk_pgd_range mm/pagewalk.c:277 [inline]
 __walk_page_range+0xe23/0x1ea0 mm/pagewalk.c:379
 walk_page_vma+0x277/0x350 mm/pagewalk.c:530
 smap_gather_stats.part.0+0x148/0x260 fs/proc/task_mmu.c:768
 smap_gather_stats fs/proc/task_mmu.c:741 [inline]
 show_smap+0xc6/0x440 fs/proc/task_mmu.c:822
 seq_read_iter+0xbb0/0x1240 fs/seq_file.c:272
 seq_read+0x3e0/0x5b0 fs/seq_file.c:162
 vfs_read+0x1b5/0x600 fs/read_write.c:479
 ksys_read+0x12d/0x250 fs/read_write.c:619
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7faa2af6c969
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007faa2aefd288 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007faa2aff4418 RCX: 00007faa2af6c969
RDX: 0000000000002025 RSI: 0000000020000100 RDI: 0000000000000003
RBP: 00007faa2aff4410 R08: 00007faa2aefd700 R09: 0000000000000000
R10: 00007faa2aefd700 R11: 0000000000000246 R12: 00007faa2afc20ac
R13: 00007fff7e6632bf R14: 00007faa2aefd400 R15: 0000000000022000
 </TASK>
Modules linked in:
---[ end trace 24ec93ff95e4ac3d ]---
RIP: 0010:PageDoubleMap include/linux/page-flags.h:785 [inline]
RIP: 0010:__page_mapcount+0x2d2/0x350 mm/util.c:744
Code: e8 d3 16 d1 ff 48 c7 c6 c0 00 b6 89 48 89 ef e8 94 4e 04 00 0f 0b e8 bd 16 d1 ff 48 c7 c6 60 01 b6 89 48 89 ef e8 7e 4e 04 00 <0f> 0b e8 a7 16 d1 ff 48 c7 c6 a0 01 b6 89 4c 89 f7 e8 68 4e 04 00
RSP: 0018:ffffc90002b6f7b8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888019619d00 RSI: ffffffff81a68c12 RDI: 0000000000000003
RBP: ffffea0001bdc2c0 R08: 0000000000000029 R09: 00000000ffffffff
R10: ffffffff8903e29f R11: 00000000ffffffff R12: 00000000ffffffff
R13: 00000000ffffea00 R14: ffffc90002b6fb30 R15: ffffea0001bd8001
FS:  00007faa2aefd700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff7e663318 CR3: 0000000018c6e000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

The reproducer was trying to read /proc/$PID/smaps while calling
MADV_FREE at the same time.  MADV_FREE may split THPs if it is called
for a partial THP.  It may trigger the below race:

         CPU A                         CPU B
         -----                         -----
 smaps walk:                      MADV_FREE:
 page_mapcount()
   PageCompound()
                                  split_huge_page()
   page = compound_head(page)
   PageDoubleMap(page)

When calling PageDoubleMap() this page is not a tail page of THP anymore
so the BUG is triggered.

This could be fixed by elevating the refcount of the page before calling
mapcount, but that prevents migration entries from being counted, and it
seems like overkill because the race can only happen when the PMD is
split, so all PTE entries of tail pages are actually migration entries,
and smaps_account() does treat migration entries as mapcount == 1, as
Kirill pointed out.

Add a new parameter for smaps_account() to indicate that the entry is a
migration entry, and skip calling page_mapcount() in that case.  Don't
skip getting mapcount for device private entries since they do track
references with mapcount.

Pagemap has a similar issue, although it was not reported.  Fix it as
well.

Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Reported-by: syzbot+1f52b3a18d5633fa7f82(a)syzkaller.appspotmail.com
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Yang Shi <shy828301(a)gmail.com>
---
v4: * s/Treated/Treat per David
    * Collected acked-by tag from David
v3: * Fixed the fix tag, the one used by v2 was not accurate
    * Added comment about the risk calling page_mapcount() per David
    * Fix pagemap
v2: * Added proper fix tag per Jann Horn
    * Rebased to the latest linus's tree

 fs/proc/task_mmu.c | 38 ++++++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 18f8c3acbb85..bc2f46033231 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -440,7 +440,8 @@ static void smaps_page_accumulate(struct mem_size_stats *mss,
 }
 
 static void smaps_account(struct mem_size_stats *mss, struct page *page,
-		bool compound, bool young, bool dirty, bool locked)
+		bool compound, bool young, bool dirty, bool locked,
+		bool migration)
 {
 	int i, nr = compound ? compound_nr(page) : 1;
 	unsigned long size = nr * PAGE_SIZE;
@@ -467,8 +468,15 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 	 * page_count(page) == 1 guarantees the page is mapped exactly once.
 	 * If any subpage of the compound page mapped with PTE it would elevate
 	 * page_count().
+	 *
+	 * The page_mapcount() is called to get a snapshot of the mapcount.
+	 * Without holding the page lock this snapshot can be slightly wrong as
+	 * we cannot always read the mapcount atomically.  It is not safe to
+	 * call page_mapcount() even with PTL held if the page is not mapped,
+	 * especially for migration entries.  Treat regular migration entries
+	 * as mapcount == 1.
 	 */
-	if (page_count(page) == 1) {
+	if ((page_count(page) == 1) || migration) {
 		smaps_page_accumulate(mss, page, size, size << PSS_SHIFT, dirty,
 			locked, true);
 		return;
@@ -517,6 +525,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 	struct vm_area_struct *vma = walk->vma;
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
 	struct page *page = NULL;
+	bool migration = false;
 
 	if (pte_present(*pte)) {
 		page = vm_normal_page(vma, addr, *pte);
@@ -536,8 +545,11 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 			} else {
 				mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
 			}
-		} else if (is_pfn_swap_entry(swpent))
+		} else if (is_pfn_swap_entry(swpent)) {
+			if (is_migration_entry(swpent))
+				migration = true;
 			page = pfn_swap_entry_to_page(swpent);
+		}
 	} else {
 		smaps_pte_hole_lookup(addr, walk);
 		return;
@@ -546,7 +558,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
 	if (!page)
 		return;
 
-	smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), locked);
+	smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte),
+		      locked, migration);
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -557,6 +570,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	struct vm_area_struct *vma = walk->vma;
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
 	struct page *page = NULL;
+	bool migration = false;
 
 	if (pmd_present(*pmd)) {
 		/* FOLL_DUMP will return -EFAULT on huge zero page */
@@ -564,8 +578,10 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	} else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
 		swp_entry_t entry = pmd_to_swp_entry(*pmd);
 
-		if (is_migration_entry(entry))
+		if (is_migration_entry(entry)) {
+			migration = true;
 			page = pfn_swap_entry_to_page(entry);
+		}
 	}
 	if (IS_ERR_OR_NULL(page))
 		return;
@@ -577,7 +593,9 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 		/* pass */;
 	else
 		mss->file_thp += HPAGE_PMD_SIZE;
-	smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
+
+	smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd),
+		      locked, migration);
 }
 #else
 static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
@@ -1378,6 +1396,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 {
 	u64 frame = 0, flags = 0;
 	struct page *page = NULL;
+	bool migration = false;
 
 	if (pte_present(pte)) {
 		if (pm->show_pfn)
@@ -1399,13 +1418,14 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 		frame = swp_type(entry) |
 			(swp_offset(entry) << MAX_SWAPFILES_SHIFT);
 		flags |= PM_SWAP;
+		migration = is_migration_entry(entry);
 		if (is_pfn_swap_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
 	}
 
 	if (page && !PageAnon(page))
 		flags |= PM_FILE;
-	if (page && page_mapcount(page) == 1)
+	if (page && !migration && page_mapcount(page) == 1)
 		flags |= PM_MMAP_EXCLUSIVE;
 	if (vma->vm_flags & VM_SOFTDIRTY)
 		flags |= PM_SOFT_DIRTY;
@@ -1421,6 +1441,7 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
 	spinlock_t *ptl;
 	pte_t *pte, *orig_pte;
 	int err = 0;
+	bool migration = false;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	ptl = pmd_trans_huge_lock(pmdp, vma);
@@ -1461,11 +1482,12 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
 			if (pmd_swp_uffd_wp(pmd))
 				flags |= PM_UFFD_WP;
 			VM_BUG_ON(!is_pmd_migration_entry(pmd));
+			migration = is_migration_entry(entry);
 			page = pfn_swap_entry_to_page(entry);
 		}
 #endif
 
-		if (page && !migration && page_mapcount(page) == 1)
+		if (page && page_mapcount(page) == 1)
 			flags |= PM_MMAP_EXCLUSIVE;
 
 		for (; addr != end; addr += PAGE_SIZE) {
-- 
2.26.3
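The pagemap part of the patch above relies on short-circuit evaluation: once the migration flag is set, page_mapcount() is never reached. A minimal userspace sketch of that ordering follows; struct model_page, model_page_mapcount(), model_mmap_exclusive() and the mapcount_reads counter are hypothetical names introduced here for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct page; only the field the model needs. */
struct model_page {
	int mapcount;
};

/* Counts how often the model reads the mapcount. */
static int mapcount_reads;

static int model_page_mapcount(const struct model_page *page)
{
	assert(page != NULL);
	mapcount_reads++;
	return page->mapcount;
}

/*
 * Models the patched check:
 *   if (page && !migration && page_mapcount(page) == 1)
 * The mapcount is only read when the entry is not a migration entry.
 */
static bool model_mmap_exclusive(const struct model_page *page, bool migration)
{
	return page && !migration && model_page_mapcount(page) == 1;
}
```

In this model a migration entry is never reported as exclusively mapped and never triggers a mapcount read, which matches the intent of avoiding page_mapcount() on pages that may be in the middle of a split.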

2 years, 2 months


Re: [PATCH 5.4 182/389] PCI/portdrv: Dont disable AER reporting in get_port_device_capability()

by Greg Kroah-Hartman

On Tue, Aug 23, 2022 at 07:20:14AM -0500, Bjorn Helgaas wrote:
> On Tue, Aug 23, 2022, 6:35 AM Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> wrote:
>
> > From: Stefan Roese <sr(a)denx.de>
> >
> > [ Upstream commit 8795e182b02dc87e343c79e73af6b8b7f9c5e635 ]
> >
>
> There's an open regression related to this commit:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=216373

This is already in the following released stable kernels:
	5.10.137 5.15.61 5.18.18 5.19.2

I'll go drop it from the 4.19 and 5.4 queues, but when this gets
resolved in Linus's tree, make sure there's a cc: stable on the fix so
that we know to backport it to the above branches as well.  Or at the
least, a "Fixes:" tag.

thanks,

greg k-h

2 years, 2 months


[PATCH] drm/panfrost: Fix the panfrost_mmu_map_fault_addr() error path

by Boris Brezillon

Make sure all bo->base.pages entries are either NULL or pointing to a
valid page before calling drm_gem_shmem_put_pages().

Reported-by: Tomeu Vizoso <tomeu.vizoso(a)collabora.com>
Cc: <stable(a)vger.kernel.org>
Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
Signed-off-by: Boris Brezillon <boris.brezillon(a)collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 569509c2ba27..d76dff201ea6 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -460,6 +460,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
 		if (IS_ERR(pages[i])) {
 			mutex_unlock(&bo->base.pages_lock);
 			ret = PTR_ERR(pages[i]);
+			pages[i] = NULL;
 			goto err_pages;
 		}
 	}
-- 
2.31.1
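The invariant stated in the patch above (every slot is either NULL or a valid page before the release helper runs) can be sketched in userspace. Everything below is a hypothetical model: the model_* helpers imitate the kernel's ERR_PTR convention, and malloc()/free() stand in for page allocation and release.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitation of the kernel's error-pointer encoding. */
#define MODEL_MAX_ERRNO 4095

static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/*
 * Release helper that expects every slot to be NULL or a valid pointer.
 * Returns the number of entries released.
 */
static int model_put_pages(void **pages, int n)
{
	int released = 0;

	for (int i = 0; i < n; i++) {
		/* An error value left in the array would be freed as if it were a page. */
		assert(!model_is_err(pages[i]));
		if (pages[i]) {
			free(pages[i]);
			pages[i] = NULL;
			released++;
		}
	}
	return released;
}

/* Fill slots in order; the allocation at index fail_at fails with -ENOMEM. */
static int model_fill_pages(void **pages, int n, int fail_at)
{
	for (int i = 0; i < n; i++) {
		pages[i] = (i == fail_at) ? model_err_ptr(-ENOMEM) : malloc(16);
		if (model_is_err(pages[i])) {
			pages[i] = NULL; /* the one-line fix: clear the slot before bailing out */
			return -ENOMEM;
		}
	}
	return 0;
}
```

Without the slot being cleared, the assertion in model_put_pages() would fire on the error path, which is the userspace analogue of handing an error pointer to the put-pages helper.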

2 years, 3 months


tcpci module in Kernel 5.15.74 with PTN5110 not working correctly

by Christian Bach

Hello

For a few weeks now I have been trying to make the PTN5110 chip work
with the new Kernel 5.15.74. The same hardware setup was working with
the 4.19.72 Kernel.

The steps I took so far are as follows:
1. Studied the documentation and looked at example device trees in the
   Kernel
2. Tried out different device tree configurations derived from the
   documentation and examples
3. Looked on Stackoverflow, the NXP forum and other forums for any
   similar issue, but could not find any
4. Updated the Kernel to the newest version I was able to find: 5.15.74
   (Hash: f0bee94053065c7cb8eacadfdd6bf739a2042b35 in Repo:
   git://git.yoctoproject.org/linux-yocto.git;branch=v5.15/standard/base)
5. Downgraded to the earliest Kernel possible: v5.10-rc1
   (Hash: 3650b228f83adda7e5ee532e2b90429c03f7b9ec in Repo:
   git://git.yoctoproject.org/linux-yocto.git;branch=v5.15/standard/base)

None of those steps had any effect. Every time I plug in a USB-A to
USB-C cable, the Kernel gets stuck in the ISR until I unplug the cable.
(Attaching a full USB-PD capable power source over a USB-C cable works
fine.) This results in unreasonably high CPU usage (most of the time the
CPU gets blocked completely).

I analyzed the I2C bus and found that the old Kernel changed many
configurations after the A-C cable was attached, while the new Kernel
does nothing (please see logs below). I also compared what happens on
the I2C bus during chip initialization but did not find any notable
differences.

My HW setup is an i.mx6ul with the PTN5110 attached on I2C4.

=================================================
My device tree looks like this:

/ {
	regulators {
		compatible = "simple-bus";
		#address-cells = <1>;
		#size-cells = <0>;

		reg_usb_otg1_vbus: regulator@2 {
			compatible = "regulator-fixed";
			reg = <2>;
			regulator-name = "usb_otg1_vbus";
			pinctrl-names = "default";
			pinctrl-0 = <&pinctrl_usb_otg1_vbus>;
			regulator-min-microvolt = <5000000>;
			regulator-max-microvolt = <5000000>;
			gpio = <&gpio2 8 GPIO_ACTIVE_HIGH>;
			enable-active-high;
			status = "okay";
		};
	};
};

&usbotg1 {
	/*pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_usbotg1>;*/
	dr_mode = "otg";
	status = "okay";
	disable-over-current;
	vbus-supply = <&reg_usb_otg1_vbus>;
};

&i2c4 {
	clock-frequency = <100000>;
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_i2c4>;
	status = "okay";

	usb_pd: ptn5110@50 {
		compatible = "nxp,ptn5110";
		reg = <0x50>;
		pinctrl-names = "default";
		pinctrl-0 = <&pinctrl_usb_pd>;
		interrupt-parent = <&gpio2>;
		interrupts = <11 IRQ_TYPE_LEVEL_LOW>;
		wakeup-source;

		usb_con: connector {
			compatible = "usb-c-connector";
			label = "USB-C";
			data-role = "dual";
			power-role = "dual";
			try-power-role = "sink";
			source-pdos = <PDO_FIXED(VSAFE5V, 2000, PDO_FIXED_USB_COMM | PDO_FIXED_DUAL_ROLE)>;
			sink-pdos = <PDO_FIXED(VSAFE5V, 2000, PDO_FIXED_USB_COMM | PDO_FIXED_DUAL_ROLE)
				     //PDO_FIXED(VSAFE5V, 3000, 0)
				     //PDO_FIXED(9000, 3000, 0)
				     PDO_FIXED(12000, 3000, 0)
				     PDO_FIXED(20000, 3000, 0)>;
				     //PDO_FIXED(20000, 5000, 0)>;
			op-sink-microwatt = <10000000>;
		};
	};
};

&iomuxc {
	pinctrl_i2c4: i2c4grp {
		fsl,pins = <
			MX6UL_PAD_UART2_TX_DATA__I2C4_SCL	0x4001b8b0
			MX6UL_PAD_UART2_RX_DATA__I2C4_SDA	0x4001b8b0
		>;
	};

	pinctrl_usb_pd: usbpdgrp {
		fsl,pins = <
			MX6UL_PAD_ENET2_TX_DATA0__GPIO2_IO11	0x0001b020 /* Alert Interrupt */
			MX6UL_PAD_ENET2_TX_CLK__GPIO2_IO14	0x0001b020 /* Fault Interrupt */
		>;
	};

	pinctrl_usb_otg1_vbus: usbotg1 {
		fsl,pins = <
			MX6UL_PAD_ENET2_RX_DATA0__GPIO2_IO08	0x000000b9
			MX6UL_PAD_ENET2_RX_DATA1__USB_OTG1_OC	0x000010b0
		>;
	};
};

=================================================
I2C Log on plug in event of Kernel 5.15.68:

Direction | Address | Data
-------------------------------
Read      | 10      | 02 22
Write     | 10      | 02 22
Read      | 14      | 04
Read      | 10      | 02 02
Write     | 10      | 02 02
Read      | 1E      | 0C
Read      | 14      | 04
Read      | 10      | 03 02
Write     | 10      | 03 02
Read      | 1E      | 0C
Read      | 1A      | 4A
Read      | 14      | 04
Read      | 1D      | 11
Read      | 1E      | 0C
Pause for 200ms
Read      | 1A      | 4A
Read      | 1A      | 4A
Read      | 1D      | 11
Write     | 1A      | 0E
Write     | 19      | 00
Write     | 2E      | 02
Write     | 23      | 66
Write     | 23      | 55
Write     | 2F      | 21
Pause for 300ms
Write     | 51      | 02
Write     | 52      | 00 00
Write     | 50      | 25
Read      | 10      | 50 02
Write     | 10      | 50 02
Read      | 10      | 00 02
Write     | 10      | 00 02
Write     | 72      | 8C 00
Read      | 1C      | 60
Read      | 10      | 00 02
Write     | 10      | 00 02
Write     | 2F      | 00
Read      | 1C      | 60
Read      | 10      | 00 02
Write     | 10      | 00 02
Write     | 2E      | 02
Read      | 10      | 00 02
Write     | 10      | 00 02
Read      | 10      | 00 02
Write     | 10      | 00 02
Read      | 10      | 00 02
Write     | 10      | 00 02
Read      | 10      | 00 02
Write     | 10      | 00 02
(It will loop like this until the cable gets detached)

I2C Log on plug in event of Kernel 5.15.68:

Direction | Address | Data
-------------------------------
Read      | 10      | 02 22
Write     | 10      | 02 22
Read      | 14      | 04
Read      | 10      | 02 02
Write     | 10      | 02 02
Read      | 1E      | 0C
Read      | 14      | 04
Read      | 10      | 02 02
Write     | 10      | 02 02
Read      | 1E      | 0C
Read      | 14      | 04
Read      | 1E      | 0C
Pause for 200ms
Read      | 10      | 01 02
Write     | 10      | 01 02
Read      | 1D      | 11
Pause for 250ms
Read      | 1A      | 4A
Write     | 1A      | 4E
Write     | 19      | 00
Write     | 2E      | 02
Write     | 23      | 66
Write     | 23      | 55
Write     | 2F      | 21
Pause for 4ms
Write     | 51      | 02
Write     | 52      | 00 00
Write     | 50      | 35
Read      | 10      | 50 02
Write     | 10      | 50 02
Read      | 10      | 00 02
Write     | 10      | 00 02
Write     | 2F      | 00
Read      | 10      | 00 02
Write     | 10      | 00 02
Read      | 1C      | 60
Read      | 10      | 00 02
Write     | 10      | 00 02
Write     | 23      | 66
Read      | 10      | 02 02
Write     | 10      | 02 02
Write     | 23      | 44
Read      | 14      | FF
Write     | 2E      | 02
Read      | 1E      | 0C
Write     | 10      | FF FF
Write     | 14      | 04
Write     | 23      | 33
Write     | 12      | 7F 00
Write     | 2F      | 00
Write     | 23      | 66
Write     | 23      | 44
Read      | 1C      | 60
Read      | 1A      | 4E
Write     | 1A      | 4E
Write     | 19      | 00
Write     | 2E      | 02
Read      | 1E      | 0C
Read      | 1D      | 11
Write     | 2F      | 00
Write     | 23      | 66
Write     | 23      | 44
Read      | 1C      | 60
Read      | 1A      | 4E
Write     | 1A      | 4E
Write     | 19      | 00
Write     | 2E      | 02
Write     | 1A      | 0F
Read      | 10      | 01 02
Write     | 10      | 01 02
Read      | 1D      | 00
Write     | 1A      | 4A
Write     | 23      | 99
Read      | 10      | 01 02
Write     | 10      | 01 02
Read      | 1D      | 11
(no more communication after this point)

-- 
Dipl. El-Ing. FH Christian Bach, Projektleiter
Direct +41 43 456 16 96 . http://www.scs.ch
Supercomputing Systems AG . Technoparkstrasse 1 . CH-8005 Zürich

2 years, 4 months


[PATCH 0/6] hwpoison, shmem, hugetlb: fix data loss issue 5.10.y

by Mike Kravetz

This is a request for adding the following patches to stable 5.10.y.

Poisoned shmem and hugetlb pages are removed from the pagecache.
Subsequent access to the offset in the file results in a NEW zero
filled page.  Application code does not get notified of the data
loss, and the only 'clue' is a message in the system log.  Data loss
has been experienced by real users.

This was addressed upstream.  Most commits were marked for backports,
but some were not.  This was discussed here [1] and here [2].

Patches apply cleanly to v5.4.224 and pass tests checking for this
specific data loss issue.  LTP mm tests show no regressions.

All patches except 4 "mm: hwpoison: handle non-anonymous THP correctly"
required a small bit of change to apply correctly: mostly for context.

linux-mm Cc'ed as it would be great to get at least an ACK from others
familiar with this issue.

[1] https://lore.kernel.org/linux-mm/Y2UTUNBHVY5U9si2@monkey/
[2] https://lore.kernel.org/stable/20221114131403.GA3807058@u2004/

James Houghton (1):
  hugetlbfs: don't delete error page from pagecache

Yang Shi (5):
  mm: hwpoison: remove the unnecessary THP check
  mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
  mm: hwpoison: refactor refcount check handling
  mm: hwpoison: handle non-anonymous THP correctly
  mm: shmem: don't truncate page if memory failure happens

 fs/hugetlbfs/inode.c       |  13 ++--
 include/linux/page-flags.h |  23 ++++++
 mm/huge_memory.c           |   2 +
 mm/hugetlb.c               |   4 +
 mm/memory-failure.c        | 153 ++++++++++++++++++++++++-------------
 mm/memory.c                |   9 +++
 mm/page_alloc.c            |   4 +-
 mm/shmem.c                 |  51 +++++++++++--
 8 files changed, 191 insertions(+), 68 deletions(-)

-- 
2.38.1

2 years, 4 months


[PATCH v2] KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID

by Paolo Bonzini

Passing the host topology to the guest is almost certainly wrong
and will confuse the scheduler.  In addition, several fields of
these CPUID leaves vary on each processor; it is simply impossible to
return the right values from KVM_GET_SUPPORTED_CPUID in such a way that
they can be passed to KVM_SET_CPUID2.

The values that will most likely prevent confusion are all zeroes.
Userspace will have to override it anyway if it wishes to present a
specific topology to the guest.

Cc: stable(a)vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
---
 Documentation/virt/kvm/api.rst | 14 ++++++++++++++
 arch/x86/kvm/cpuid.c           | 32 ++++++++++++++++----------------
 2 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index eee9f857a986..20f4f6b302ff 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8249,6 +8249,20 @@ CPU[EAX=1]:ECX[24] (TSC_DEADLINE) is not reported by ``KVM_GET_SUPPORTED_CPUID``
 It can be enabled if ``KVM_CAP_TSC_DEADLINE_TIMER`` is present and the kernel
 has enabled in-kernel emulation of the local APIC.
 
+CPU topology
+~~~~~~~~~~~~
+
+Several CPUID values include topology information for the host CPU:
+0x0b and 0x1f for Intel systems, 0x8000001e for AMD systems.  Different
+versions of KVM return different values for this information and userspace
+should not rely on it.  Currently they return all zeroes.
+
+If userspace wishes to set up a guest topology, it should be careful that
+the values of these three leaves differ for each CPU.  In particular,
+the APIC ID is found in EDX for all subleaves of 0x0b and 0x1f, and in EAX
+for 0x8000001e; the latter also encodes the core id and node id in bits
+7:0 of EBX and ECX respectively.
+
 Obsolete ioctls and capabilities
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0810e93cbedc..164bfb7e7a16 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -759,16 +759,22 @@ struct kvm_cpuid_array {
 	int nent;
 };
 
+static struct kvm_cpuid_entry2 *get_next_cpuid(struct kvm_cpuid_array *array)
+{
+	if (array->nent >= array->maxnent)
+		return NULL;
+
+	return &array->entries[array->nent++];
+}
+
 static struct kvm_cpuid_entry2 *do_host_cpuid(struct kvm_cpuid_array *array,
 					      u32 function, u32 index)
 {
-	struct kvm_cpuid_entry2 *entry;
+	struct kvm_cpuid_entry2 *entry = get_next_cpuid(array);
 
-	if (array->nent >= array->maxnent)
+	if (!entry)
 		return NULL;
 
-	entry = &array->entries[array->nent++];
-
 	memset(entry, 0, sizeof(*entry));
 	entry->function = function;
 	entry->index = index;
@@ -945,22 +951,13 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		entry->edx = edx.full;
 		break;
 	}
-	/*
-	 * Per Intel's SDM, the 0x1f is a superset of 0xb,
-	 * thus they can be handled by common code.
-	 */
 	case 0x1f:
 	case 0xb:
 		/*
-		 * Populate entries until the level type (ECX[15:8]) of the
-		 * previous entry is zero.  Note, CPUID EAX.{0x1f,0xb}.0 is
-		 * the starting entry, filled by the primary do_host_cpuid().
+		 * No topology; a valid topology is indicated by the presence
+		 * of subleaf 1.
 		 */
-		for (i = 1; entry->ecx & 0xff00; ++i) {
-			entry = do_host_cpuid(array, function, i);
-			if (!entry)
-				goto out;
-		}
+		entry->eax = entry->ebx = entry->ecx = 0;
 		break;
 	case 0xd: {
 		u64 permitted_xcr0 = kvm_caps.supported_xcr0 & xstate_get_guest_group_perm();
@@ -1193,6 +1190,9 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		entry->ebx = entry->ecx = entry->edx = 0;
 		break;
 	case 0x8000001e:
+		/* Do not return host topology information.  */
+		entry->eax = entry->ebx = entry->ecx = 0;
+		entry->edx = 0; /* reserved */
 		break;
 	case 0x8000001F:
 		if (!kvm_cpu_cap_has(X86_FEATURE_SEV)) {
-- 
2.31.1
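
The documentation hunk above says userspace must supply per-CPU values if it wants a guest topology, with the APIC ID in EDX for every subleaf of 0x0b. As a rough illustration only, the following sketch fills leaf 0x0b subleaves for one vCPU under an assumed two-level (SMT, core) topology. The names struct cpuid_entry and fill_leaf_0b() are hypothetical and are not part of the KVM API; the field layout (EAX[4:0] shift, EBX[15:0] logical processor count, ECX[7:0] level number, ECX[15:8] level type) follows the Intel SDM description of this leaf, and a real VMM would copy the result into struct kvm_cpuid_entry2 before calling KVM_SET_CPUID2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the struct kvm_cpuid_entry2 fields used here. */
struct cpuid_entry {
	uint32_t function;
	uint32_t index;
	uint32_t eax, ebx, ecx, edx;
};

/* Level types reported in ECX[15:8] of leaf 0x0b. */
enum { LEVEL_INVALID = 0, LEVEL_SMT = 1, LEVEL_CORE = 2 };

/*
 * Fill one subleaf of leaf 0x0b for the vCPU whose x2APIC ID is apic_id.
 * smt_bits and core_bits are the assumed APIC ID field widths of the
 * thread and core levels; both are parameters of this sketch only.
 */
void fill_leaf_0b(struct cpuid_entry *e, uint32_t subleaf, uint32_t apic_id,
		  uint32_t smt_bits, uint32_t core_bits)
{
	uint32_t pkg_shift = smt_bits + core_bits;

	assert(e && pkg_shift < 32);

	memset(e, 0, sizeof(*e));
	e->function = 0x0b;
	e->index = subleaf;
	e->ecx = subleaf & 0xff;	/* level number; level type stays invalid */
	e->edx = apic_id;		/* differs per vCPU, same in all subleaves */

	switch (subleaf) {
	case 0:				/* SMT level */
		e->eax = smt_bits;
		e->ebx = 1u << smt_bits;
		e->ecx |= LEVEL_SMT << 8;
		break;
	case 1:				/* core level */
		e->eax = pkg_shift;
		e->ebx = 1u << pkg_shift;
		e->ecx |= LEVEL_CORE << 8;
		break;
	default:			/* terminating subleaf: EAX = EBX = 0 */
		break;
	}
}
```

A caller would invoke fill_leaf_0b() for subleaves 0, 1 and 2 on every vCPU, changing only apic_id between vCPUs.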


[PATCH v6 1/6] locking/rwsem: Prevent non-first waiter from spinning in down_write() slowpath

by Waiman Long

A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free if the following sequence happens:

  Non-first RT waiter    First waiter      Lock holder
  -------------------    ------------      -----------
  Acquire wait_lock
  rwsem_try_write_lock():
    Set handoff bit if RT or
      wait too long
    Set waiter->handoff_set
  Release wait_lock
                         Acquire wait_lock
                         Inherit waiter->handoff_set
                         Release wait_lock
                                           Clear owner
                                           Release lock
  if (waiter.handoff_set) {
    rwsem_spin_on_owner();
    if (OWNER_NULL)
      goto trylock_again;
  }
  trylock_again:
  Acquire wait_lock
  rwsem_try_write_lock():
     if (first->handoff_set && (waiter != first))
        return false;
  Release wait_lock

A non-first waiter cannot really acquire the rwsem even if it mistakenly
believes that it can spin on OWNER_NULL value. If that waiter happens
to be an RT task running on the same CPU as the first waiter, it can
block the first waiter from acquiring the rwsem leading to live lock.
Fix this problem by making sure that a non-first waiter cannot spin in
the slowpath loop without sleeping.

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Reviewed-and-tested-by: Mukesh Ojha <quic_mojha(a)quicinc.com>
Signed-off-by: Waiman Long <longman(a)redhat.com>
Cc: stable(a)vger.kernel.org
---
 kernel/locking/rwsem.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 44873594de03..be2df9ea7c30 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -624,18 +624,16 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 			 */
 			if (first->handoff_set && (waiter != first))
 				return false;
-
-			/*
-			 * First waiter can inherit a previously set handoff
-			 * bit and spin on rwsem if lock acquisition fails.
-			 */
-			if (waiter == first)
-				waiter->handoff_set = true;
 		}
 
 		new = count;
 
 		if (count & RWSEM_LOCK_MASK) {
+			/*
+			 * A waiter (first or not) can set the handoff bit
+			 * if it is an RT task or wait in the wait queue
+			 * for too long.
+			 */
 			if (has_handoff || (!rt_task(waiter->task) &&
 					    !time_after(jiffies, waiter->timeout)))
 				return false;
@@ -651,11 +649,12 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 	} while (!atomic_long_try_cmpxchg_acquire(&sem->count, &count, new));
 
 	/*
-	 * We have either acquired the lock with handoff bit cleared or
-	 * set the handoff bit.
+	 * We have either acquired the lock with handoff bit cleared or set
+	 * the handoff bit. Only the first waiter can have its handoff_set
+	 * set here to enable optimistic spinning in slowpath loop.
 	 */
 	if (new & RWSEM_FLAG_HANDOFF) {
-		waiter->handoff_set = true;
+		first->handoff_set = true;
 		lockevent_inc(rwsem_wlock_handoff);
 		return false;
 	}
-- 
2.31.1
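
To make the post-patch rule easier to follow, here is a single-threaded model of the decision logic in rwsem_try_write_lock() as changed above. It is only a sketch: it drops the atomics, the wait_lock and the wait list, and the names model_waiter, model_rwsem and model_try_write_lock() are hypothetical, not kernel symbols. The point it illustrates is that when a non-first waiter sets the handoff bit, handoff_set is recorded on the first waiter, so only the first waiter is allowed to spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_waiter {
	bool is_rt;		/* models rt_task(waiter->task) */
	bool timed_out;		/* models time_after(jiffies, waiter->timeout) */
	bool handoff_set;
};

struct model_rwsem {
	bool locked;			/* models count & RWSEM_LOCK_MASK */
	bool handoff;			/* models RWSEM_FLAG_HANDOFF */
	struct model_waiter *first;	/* head of the wait queue */
};

/* Returns true only if the caller acquired the write lock. */
bool model_try_write_lock(struct model_rwsem *sem, struct model_waiter *waiter)
{
	struct model_waiter *first;

	assert(sem != NULL && waiter != NULL && sem->first != NULL);
	first = sem->first;

	/* Honor a handoff that is owned by the first waiter. */
	if (sem->handoff && first->handoff_set && waiter != first)
		return false;

	if (sem->locked) {
		/* Only an RT or timed-out waiter may request a handoff. */
		if (sem->handoff || (!waiter->is_rt && !waiter->timed_out))
			return false;
		sem->handoff = true;
		/* Post-patch: the flag goes to the first waiter, not the caller. */
		first->handoff_set = true;
		return false;
	}

	/* Lock is free: take it and clear any pending handoff. */
	sem->locked = true;
	sem->handoff = false;
	return true;
}
```

In this model a non-first RT waiter that requests the handoff never has its own handoff_set raised, so after the holder releases the lock it fails the next attempt and must sleep, while the first waiter acquires the lock.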

