Hi stable folks,
please queue for 5.10-stable.
See https://bugzilla.kernel.org/show_bug.cgi?id=214159 for more info.
---
Commit 3a7956e25e1d7b3c148569e78895e1f3178122a9 upstream.
The kthread_is_per_cpu() construct relies on only being called on
PF_KTHREAD tasks (per the WARN in to_kthread). This gives rise to the
following usage pattern:
	if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
However, as reported by syzkaller, this is broken. The scenario is:
	CPU0                            CPU1 (running p)

	(p->flags & PF_KTHREAD) // true

	                                begin_new_exec()
	                                  me->flags &= ~(PF_KTHREAD|...);

	kthread_is_per_cpu(p)
	  to_kthread(p)
	    WARN(!(p->flags & PF_KTHREAD) <-- *SPLAT*
Introduce __to_kthread() that omits the WARN and is sure to check both
values.
Use this to remove the problematic pattern for kthread_is_per_cpu()
and fix a number of other kthread_*() functions that have similar
issues but are currently not used in ways that would expose the
problem.
Notably kthread_func() is only ever called on 'current', while
kthread_probe_data() is only used for PF_WQ_WORKER, which implies the
task is from kthread_create*().
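To make the call-site change concrete, here is a minimal before/after
sketch (illustrative only, mirroring the fair.c hunk below):

	/* Before: PF_KTHREAD can be cleared by exec() between the flag test
	 * and to_kthread(), tripping the WARN. */
	if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
		return 0;

	/* After: kthread_is_per_cpu() goes through __to_kthread(), which
	 * returns NULL unless PF_KTHREAD and p->set_child_tid are both set,
	 * so the caller no longer needs the racy flag test. */
	if (kthread_is_per_cpu(p))
		return 0;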
Fixes: ac687e6e8c26 ("kthread: Extract KTHREAD_IS_PER_CPU")
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Valentin Schneider <Valentin.Schneider(a)arm.com>
Link: https://lkml.kernel.org/r/YH6WJc825C4P0FCK@hirez.programming.kicks-ass.net
[ Drop the balance_push() hunk as it is not needed. ]
Signed-off-by: Borislav Petkov <bp(a)suse.de>
---
kernel/kthread.c | 33 +++++++++++++++++++++++++++------
kernel/sched/fair.c | 2 +-
2 files changed, 28 insertions(+), 7 deletions(-)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 9825cf89c614..508fe5278285 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -84,6 +84,25 @@ static inline struct kthread *to_kthread(struct task_struct *k)
return (__force void *)k->set_child_tid;
}
+/*
+ * Variant of to_kthread() that doesn't assume @p is a kthread.
+ *
+ * Per construction; when:
+ *
+ * (p->flags & PF_KTHREAD) && p->set_child_tid
+ *
+ * the task is both a kthread and struct kthread is persistent. However
+ * PF_KTHREAD on its own is not, kernel_thread() can exec() (See umh.c and
+ * begin_new_exec()).
+ */
+static inline struct kthread *__to_kthread(struct task_struct *p)
+{
+ void *kthread = (__force void *)p->set_child_tid;
+ if (kthread && !(p->flags & PF_KTHREAD))
+ kthread = NULL;
+ return kthread;
+}
+
void free_kthread_struct(struct task_struct *k)
{
struct kthread *kthread;
@@ -168,8 +187,9 @@ EXPORT_SYMBOL_GPL(kthread_freezable_should_stop);
*/
void *kthread_func(struct task_struct *task)
{
- if (task->flags & PF_KTHREAD)
- return to_kthread(task)->threadfn;
+ struct kthread *kthread = __to_kthread(task);
+ if (kthread)
+ return kthread->threadfn;
return NULL;
}
EXPORT_SYMBOL_GPL(kthread_func);
@@ -199,10 +219,11 @@ EXPORT_SYMBOL_GPL(kthread_data);
*/
void *kthread_probe_data(struct task_struct *task)
{
- struct kthread *kthread = to_kthread(task);
+ struct kthread *kthread = __to_kthread(task);
void *data = NULL;
- copy_from_kernel_nofault(&data, &kthread->data, sizeof(data));
+ if (kthread)
+ copy_from_kernel_nofault(&data, &kthread->data, sizeof(data));
return data;
}
@@ -514,9 +535,9 @@ void kthread_set_per_cpu(struct task_struct *k, int cpu)
set_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
}
-bool kthread_is_per_cpu(struct task_struct *k)
+bool kthread_is_per_cpu(struct task_struct *p)
{
- struct kthread *kthread = to_kthread(k);
+ struct kthread *kthread = __to_kthread(p);
if (!kthread)
return false;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 262b02d75007..bad97d35684d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7569,7 +7569,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
return 0;
/* Disregard pcpu kthreads; they are where they need to be. */
- if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
+ if (kthread_is_per_cpu(p))
return 0;
if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
--
2.29.2
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
From: Lai Jiangshan <laijs(a)linux.alibaba.com>
commit b1bd5cba3306691c771d558e94baa73e8b0b96b7 upstream.
When computing the access permissions of a shadow page, use the effective
permissions of the walk up to that point, i.e. the logical AND of its parents'
permissions. Two guest PxE entries that point at the same table gfn need to
be shadowed with different shadow pages if their parents' permissions are
different. KVM currently uses the effective permissions of the last
non-leaf entry for all non-leaf entries. Because all non-leaf SPTEs have
full ("uwx") permissions, and the effective permissions are recorded only
in role.access and merged into the leaves, this can lead to incorrect
reuse of a shadow page and eventually to a missing guest protection page
fault.
For example, here is a shared pagetable:
   pgd[]     pud[]        pmd[]            virtual address pointers
                     /->pmd1(u--)->pte1(uw-)->page1 <- ptr1 (u--)
        /->pud1(uw-)--->pmd2(uw-)->pte2(uw-)->page2 <- ptr2 (uw-)
   pgd-|           (shared pmd[] as above)
        \->pud2(u--)--->pmd1(u--)->pte1(uw-)->page1 <- ptr3 (u--)
                     \->pmd2(uw-)->pte2(uw-)->page2 <- ptr4 (u--)
pud1 and pud2 point to the same pmd table, so:
- ptr1 and ptr3 point to the same page.
- ptr2 and ptr4 point to the same page.
(pud1 and pud2 here are pud entries, while pmd1 and pmd2 here are pmd entries)
- First, the guest reads from ptr1 and KVM prepares a shadow
page table with role.access=u--, from ptr1's pud1 and ptr1's pmd1.
"u--" comes from the effective permissions of pgd, pud1 and
pmd1, which are stored in pt->access. "u--" is used also to get
the pagetable for pud1, instead of "uw-".
- Then the guest writes to ptr2 and KVM reuses pud1, which is present.
The hypervisor sets up a shadow page for ptr2 with pt->access "uw-",
even though the pud1 pmd (because of the incorrect argument to
kvm_mmu_get_page in the previous step) has role.access="u--".
- Then the guest reads from ptr3. The hypervisor reuses pud1's
shadow pmd for pud2, because both use "u--" for their permissions.
Thus, the shadow pmd already includes entries for both pmd1 and pmd2.
- Finally, the guest writes to ptr4. This causes no vmexit or page fault,
because pud1's shadow page structures included an "uw-" page even though
its role.access was "u--".
Any kind of shared pagetable can exhibit a similar problem in a virtual
machine without TDP enabled, whenever the permissions inherited from
different ancestors differ.
In order to fix the problem, change pt->access to an array: each entry
records the effective permissions of the walk down to that level, and
never includes permissions ANDed in from child ptes.
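As a minimal sketch (illustrative pseudo-C; the names follow the hunks
below, and details of the gpte conversion are omitted), the walker now
records the accumulated parent permissions per level and fetch() picks
the entry for the level it is instantiating:

	/* walk_addr_generic(): AND in each parent's permissions and remember
	 * the result for that level of the walk. */
	walker->pt_access[walker->level - 1] =
		FNAME(gpte_access)(vcpu, pt_access ^ walk_nx_mask);

	/* fetch(): take the permissions accumulated above the table being
	 * shadowed, so two walks that share a table gfn but have different
	 * parent permissions get different shadow pages. */
	access = gw->pt_access[it.level - 2];
	sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level - 1, false, access);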
The test code is: https://lore.kernel.org/kvm/20210603050537.19605-1-jiangshanlai@gmail.com/
Remember to test it with TDP disabled.
The problem had existed long before commit 41074d07c78b ("KVM: MMU:
Fix inherited permissions for emulated guest pte updates"), and it is
hard to find which commit is the culprit, so there is no fixes tag here.
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Message-Id: <20210603052455.21023-1-jiangshanlai(a)gmail.com>
Cc: stable(a)vger.kernel.org
Fixes: cea0f0e7ea54 ("[PATCH] KVM: MMU: Shadow page table caching")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
[OP: - apply arch/x86/kvm/mmu/* changes to arch/x86/kvm
- apply documentation changes to Documentation/virtual/kvm/mmu.txt
- add vcpu parameter to gpte_access() call]
Signed-off-by: Ovidiu Panait <ovidiu.panait(a)windriver.com>
---
Note: The backport was validated by running the kvm-unit-tests testcase [1]
mentioned in the commit message (the testcase fails without the patch and
passes with the patch applied).
[1] https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/commit/47fd6bc54674fb1d8…
Documentation/virtual/kvm/mmu.txt | 4 ++--
arch/x86/kvm/paging_tmpl.h | 14 +++++++++-----
2 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index e507a9e0421e..851a8abcadce 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -152,8 +152,8 @@ Shadow pages contain the following information:
shadow pages) so role.quadrant takes values in the range 0..3. Each
quadrant maps 1GB virtual address space.
role.access:
- Inherited guest access permissions in the form uwx. Note execute
- permission is positive, not negative.
+ Inherited guest access permissions from the parent ptes in the form uwx.
+ Note execute permission is positive, not negative.
role.invalid:
The page is invalid and should not be used. It is a root page that is
currently pinned (by a cpu hardware register pointing to it); once it is
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 8cf7a09bdd73..677b195f7cf1 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -93,8 +93,8 @@ struct guest_walker {
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
bool pte_writable[PT_MAX_FULL_LEVELS];
- unsigned pt_access;
- unsigned pte_access;
+ unsigned int pt_access[PT_MAX_FULL_LEVELS];
+ unsigned int pte_access;
gfn_t gfn;
struct x86_exception fault;
};
@@ -388,13 +388,15 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
walker->ptes[walker->level - 1] = pte;
+
+ /* Convert to ACC_*_MASK flags for struct guest_walker. */
+ walker->pt_access[walker->level - 1] = FNAME(gpte_access)(vcpu, pt_access ^ walk_nx_mask);
} while (!is_last_gpte(mmu, walker->level, pte));
pte_pkey = FNAME(gpte_pkeys)(vcpu, pte);
accessed_dirty = have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0;
/* Convert to ACC_*_MASK flags for struct guest_walker. */
- walker->pt_access = FNAME(gpte_access)(vcpu, pt_access ^ walk_nx_mask);
walker->pte_access = FNAME(gpte_access)(vcpu, pte_access ^ walk_nx_mask);
errcode = permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, access);
if (unlikely(errcode))
@@ -432,7 +434,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
}
pgprintk("%s: pte %llx pte_access %x pt_access %x\n",
- __func__, (u64)pte, walker->pte_access, walker->pt_access);
+ __func__, (u64)pte, walker->pte_access,
+ walker->pt_access[walker->level - 1]);
return 1;
error:
@@ -601,7 +604,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
{
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
- unsigned direct_access, access = gw->pt_access;
+ unsigned int direct_access, access;
int top_level, ret;
gfn_t gfn, base_gfn;
@@ -633,6 +636,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
sp = NULL;
if (!is_shadow_present_pte(*it.sptep)) {
table_gfn = gw->table_gfn[it.level - 2];
+ access = gw->pt_access[it.level - 2];
sp = kvm_mmu_get_page(vcpu, table_gfn, addr, it.level-1,
false, access);
}
--
2.18.1
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 112022bdb5bc372e00e6e43cb88ee38ea67b97bd Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Tue, 22 Jun 2021 10:56:47 -0700
Subject: [PATCH] KVM: x86/mmu: Treat NX as used (not reserved) for all !TDP
shadow MMUs
Mark NX as being used for all non-nested shadow MMUs, as KVM will set the
NX bit for huge SPTEs if the iTLB multi-hit mitigation is enabled.
Checking the mitigation itself is not sufficient as it can be toggled on
at any time and KVM doesn't reset MMU contexts when that happens. KVM
could reset the contexts, but that would require purging all SPTEs in all
MMUs, for no real benefit. And, KVM already forces EFER.NX=1 when TDP is
disabled (for WP=0, SMEP=1, NX=0), so technically NX is never reserved
for shadow MMUs.
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20210622175739.3610207-3-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b3be690d081a..444e068e6ad9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4221,7 +4221,15 @@ static inline u64 reserved_hpa_bits(void)
void
reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
{
- bool uses_nx = context->nx ||
+ /*
+ * KVM uses NX when TDP is disabled to handle a variety of scenarios,
+ * notably for huge SPTEs if iTLB multi-hit mitigation is enabled and
+ * to generate correct permissions for CR0.WP=0/CR4.SMEP=1/EFER.NX=0.
+ * The iTLB multi-hit workaround can be toggled at any time, so assume
+ * NX can be used by any non-nested shadow MMU to avoid having to reset
+ * MMU contexts. Note, KVM forces EFER.NX=1 when TDP is disabled.
+ */
+ bool uses_nx = context->nx || !tdp_enabled ||
context->mmu_role.base.smep_andnot_wp;
struct rsvd_bits_validate *shadow_zero_check;
int i;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f123c42bbeff26bfe8bdb08a01307e92d51eec39 Mon Sep 17 00:00:00 2001
From: Kees Cook <keescook(a)chromium.org>
Date: Wed, 23 Jun 2021 13:39:33 -0700
Subject: [PATCH] lkdtm: Enable DOUBLE_FAULT on all architectures
Where feasible, I prefer to have all tests visible on all architectures,
but to have them wired to XFAIL. DOUBLE_FAULT was set up to XFAIL, but
wasn't actually being added to the test list.
Fixes: cea23efb4de2 ("lkdtm/bugs: Make double-fault test always available")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
Link: https://lore.kernel.org/r/20210623203936.3151093-7-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index 645b31e98c77..2c89fc18669f 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -178,9 +178,7 @@ static const struct crashtype crashtypes[] = {
CRASHTYPE(STACKLEAK_ERASING),
CRASHTYPE(CFI_FORWARD_PROTO),
CRASHTYPE(FORTIFIED_STRSCPY),
-#ifdef CONFIG_X86_32
CRASHTYPE(DOUBLE_FAULT),
-#endif
#ifdef CONFIG_PPC_BOOK3S_64
CRASHTYPE(PPC_SLB_MULTIHIT),
#endif
[ Upstream commit 7428022b50d0fbb4846dd0f00639ea09d36dff02 ]
When a port leaves a VLAN-aware bridge, the current code does not clear
other ports' matrix field bit. If the bridge is later set to VLAN-unaware
mode, traffic in the bridge may leak to that port.
Remove the VLAN filtering check in mt7530_port_bridge_leave.
Fixes: 4fe4e1f48ba1 ("net: dsa: mt7530: fix VLAN traffic leaks")
Fixes: 83163f7dca56 ("net: dsa: mediatek: add VLAN support for MT7530")
Signed-off-by: DENG Qingfang <dqfext(a)gmail.com>
Reviewed-by: Vladimir Oltean <olteanv(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
---
drivers/net/dsa/mt7530.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index aec606058d98..fc45af12612f 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -840,11 +840,8 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port,
/* Remove this port from the port matrix of the other ports
* in the same bridge. If the port is disabled, port matrix
* is kept and not being setup until the port becomes enabled.
- * And the other port's port matrix cannot be broken when the
- * other port is still a VLAN-aware port.
*/
- if (dsa_is_user_port(ds, i) && i != port &&
- !dsa_port_is_vlan_filtering(&ds->ports[i])) {
+ if (dsa_is_user_port(ds, i) && i != port) {
if (dsa_to_port(ds, i)->bridge_dev != bridge)
continue;
if (priv->ports[i].enable)
--
2.25.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7b40066c97ec66a44e388f82fcf694987451768f Mon Sep 17 00:00:00 2001
From: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Date: Thu, 5 Aug 2021 15:29:54 -0400
Subject: [PATCH] tracepoint: Use rcu get state and cond sync for static call
updates
State transitions from 1->0->1 and N->2->1 callbacks require RCU
synchronization. Rather than performing the RCU synchronization every
time the state change occurs, which is quite slow when many tracepoints
are registered in batch, instead keep a snapshot of the RCU state on the
most recent transitions which belong to a chain, and conditionally wait
for a grace period on the last transition of the chain if one g.p. has
not elapsed since the last snapshot.
This applies to both RCU and SRCU.
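In terms of the RCU polling API (a generic sketch, not the exact
tracepoint hunks below), the unconditional synchronize is replaced by a
snapshot taken at the earlier transition and a conditional wait at the
later one:

	unsigned long rcu_cookie, srcu_cookie;

	/* At the earlier transition of the chain: snapshot grace-period state. */
	rcu_cookie  = get_state_synchronize_rcu();
	srcu_cookie = start_poll_synchronize_srcu(&tracepoint_srcu);

	/* At the last transition of the chain: wait only if no grace period
	 * has elapsed since the snapshot was taken. */
	cond_synchronize_rcu(rcu_cookie);
	if (!poll_state_synchronize_srcu(&tracepoint_srcu, srcu_cookie))
		synchronize_srcu(&tracepoint_srcu);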
This brings the performance regression caused by commit 231264d6927f
("Fix: tracepoint: static call function vs data state mismatch") back to
what it was originally.
Before this commit:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real    0m10.593s
  user    0m0.017s
  sys     0m0.259s

After this commit:

  # trace-cmd start -e all
  # time trace-cmd start -p nop

  real    0m0.878s
  user    0m0.000s
  sys     0m0.103s
Link: https://lkml.kernel.org/r/20210805192954.30688-1-mathieu.desnoyers@efficios…
Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba…
Cc: stable(a)vger.kernel.org
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Stefan Metzmacher <metze(a)samba.org>
Fixes: 231264d6927f ("Fix: tracepoint: static call function vs data state mismatch")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Reviewed-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 8d772bd6894d..efd14c79fab4 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -28,6 +28,44 @@ extern tracepoint_ptr_t __stop___tracepoints_ptrs[];
DEFINE_SRCU(tracepoint_srcu);
EXPORT_SYMBOL_GPL(tracepoint_srcu);
+enum tp_transition_sync {
+ TP_TRANSITION_SYNC_1_0_1,
+ TP_TRANSITION_SYNC_N_2_1,
+
+ _NR_TP_TRANSITION_SYNC,
+};
+
+struct tp_transition_snapshot {
+ unsigned long rcu;
+ unsigned long srcu;
+ bool ongoing;
+};
+
+/* Protected by tracepoints_mutex */
+static struct tp_transition_snapshot tp_transition_snapshot[_NR_TP_TRANSITION_SYNC];
+
+static void tp_rcu_get_state(enum tp_transition_sync sync)
+{
+ struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
+
+ /* Keep the latest get_state snapshot. */
+ snapshot->rcu = get_state_synchronize_rcu();
+ snapshot->srcu = start_poll_synchronize_srcu(&tracepoint_srcu);
+ snapshot->ongoing = true;
+}
+
+static void tp_rcu_cond_sync(enum tp_transition_sync sync)
+{
+ struct tp_transition_snapshot *snapshot = &tp_transition_snapshot[sync];
+
+ if (!snapshot->ongoing)
+ return;
+ cond_synchronize_rcu(snapshot->rcu);
+ if (!poll_state_synchronize_srcu(&tracepoint_srcu, snapshot->srcu))
+ synchronize_srcu(&tracepoint_srcu);
+ snapshot->ongoing = false;
+}
+
/* Set to 1 to enable tracepoint debug output */
static const int tracepoint_debug;
@@ -311,6 +349,11 @@ static int tracepoint_add_func(struct tracepoint *tp,
*/
switch (nr_func_state(tp_funcs)) {
case TP_FUNC_1: /* 0->1 */
+ /*
+ * Make sure new static func never uses old data after a
+ * 1->0->1 transition sequence.
+ */
+ tp_rcu_cond_sync(TP_TRANSITION_SYNC_1_0_1);
/* Set static call to first function */
tracepoint_update_call(tp, tp_funcs);
/* Both iterator and static call handle NULL tp->funcs */
@@ -325,10 +368,15 @@ static int tracepoint_add_func(struct tracepoint *tp,
* Requires ordering between RCU assign/dereference and
* static call update/call.
*/
- rcu_assign_pointer(tp->funcs, tp_funcs);
- break;
+ fallthrough;
case TP_FUNC_N: /* N->N+1 (N>1) */
rcu_assign_pointer(tp->funcs, tp_funcs);
+ /*
+ * Make sure static func never uses incorrect data after a
+ * N->...->2->1 (N>1) transition sequence.
+ */
+ if (tp_funcs[0].data != old[0].data)
+ tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
break;
default:
WARN_ON_ONCE(1);
@@ -372,24 +420,23 @@ static int tracepoint_remove_func(struct tracepoint *tp,
/* Both iterator and static call handle NULL tp->funcs */
rcu_assign_pointer(tp->funcs, NULL);
/*
- * Make sure new func never uses old data after a 1->0->1
- * transition sequence.
- * Considering that transition 0->1 is the common case
- * and don't have rcu-sync, issue rcu-sync after
- * transition 1->0 to break that sequence by waiting for
- * readers to be quiescent.
+ * Make sure new static func never uses old data after a
+ * 1->0->1 transition sequence.
*/
- tracepoint_synchronize_unregister();
+ tp_rcu_get_state(TP_TRANSITION_SYNC_1_0_1);
break;
case TP_FUNC_1: /* 2->1 */
rcu_assign_pointer(tp->funcs, tp_funcs);
/*
- * On 2->1 transition, RCU sync is needed before setting
- * static call to first callback, because the observer
- * may have loaded any prior tp->funcs after the last one
- * associated with an rcu-sync.
+ * Make sure static func never uses incorrect data after a
+ * N->...->2->1 (N>2) transition sequence. If the first
+ * element's data has changed, then force the synchronization
+ * to prevent current readers that have loaded the old data
+ * from calling the new function.
*/
- tracepoint_synchronize_unregister();
+ if (tp_funcs[0].data != old[0].data)
+ tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
+ tp_rcu_cond_sync(TP_TRANSITION_SYNC_N_2_1);
/* Set static call to first function */
tracepoint_update_call(tp, tp_funcs);
break;
@@ -397,6 +444,12 @@ static int tracepoint_remove_func(struct tracepoint *tp,
fallthrough;
case TP_FUNC_N:
rcu_assign_pointer(tp->funcs, tp_funcs);
+ /*
+ * Make sure static func never uses incorrect data after a
+ * N->...->2->1 (N>2) transition sequence.
+ */
+ if (tp_funcs[0].data != old[0].data)
+ tp_rcu_get_state(TP_TRANSITION_SYNC_N_2_1);
break;
default:
WARN_ON_ONCE(1);
One cannot ftrace functions that are used to set up ftrace itself. Doing
so leads to a racy kernel panic, as observed by users on SiFive HiFive
Unmatched boards with Ubuntu kernels.

This has been debugged and fixed in v5.12 kernels by ensuring that all
SBI functions are excluded from ftrace.
Link: https://forums.sifive.com/t/u-boot-says-unhandled-exception-illegal-instruc…
BugLink: https://bugs.launchpad.net/bugs/1934548
Guo Ren (2):
riscv: Fixup wrong ftrace remove cflag
riscv: Fixup patch_text panic in ftrace
arch/riscv/kernel/Makefile | 5 +++--
arch/riscv/mm/Makefile | 3 ++-
2 files changed, 5 insertions(+), 3 deletions(-)
--
2.30.2