January 2018 - Linux-stable-mirror

[Linux-stable-mirror] Patch "x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader" has been added to the 4.4-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled

    x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

to the 4.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…

The filename of the patch is:
     x86-vdso-pvclock-simplify-and-speed-up-the-vdso-pvclock-reader.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.


From 6b078f5de7fc0851af4102493c7b5bb07e49c4cb Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)amacapital.net>
Date: Thu, 10 Dec 2015 19:20:19 -0800
Subject: x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

From: Andy Lutomirski <luto(a)amacapital.net>

commit 6b078f5de7fc0851af4102493c7b5bb07e49c4cb upstream.

The pvclock vdso code was too abstracted to understand easily
and excessively paranoid.  Simplify it for a huge speedup.

This opens the door for additional simplifications, as the vdso
no longer accesses the pvti for any vcpu other than vcpu 0.

Before, vclock_gettime using kvm-clock took about 45ns on my
machine. With this change, it takes 29ns, which is almost as
fast as the pure TSC implementation.

Signed-off-by: Andy Lutomirski <luto(a)amacapital.net>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/6b51dcc41f1b101f963945c5ec7093d72bdac429.144970253…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Jamie Iles <jamie.iles(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>

---
 arch/x86/entry/vdso/vclock_gettime.c |   79 +++++++++++++++++++----------------
 1 file changed, 45 insertions(+), 34 deletions(-)

--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -78,47 +78,58 @@ static notrace const struct pvclock_vsys
 
 static notrace cycle_t vread_pvclock(int *mode)
 {
-	const struct pvclock_vsyscall_time_info *pvti;
+	const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
 	cycle_t ret;
-	u64 last;
-	u32 version;
-	u8 flags;
-	unsigned cpu, cpu1;
-
+	u64 tsc, pvti_tsc;
+	u64 last, delta, pvti_system_time;
+	u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
 
 	/*
-	 * Note: hypervisor must guarantee that:
-	 * 1. cpu ID number maps 1:1 to per-CPU pvclock time info.
-	 * 2. that per-CPU pvclock time info is updated if the
-	 *    underlying CPU changes.
-	 * 3. that version is increased whenever underlying CPU
-	 *    changes.
+	 * Note: The kernel and hypervisor must guarantee that cpu ID
+	 * number maps 1:1 to per-CPU pvclock time info.
+	 *
+	 * Because the hypervisor is entirely unaware of guest userspace
+	 * preemption, it cannot guarantee that per-CPU pvclock time
+	 * info is updated if the underlying CPU changes or that that
+	 * version is increased whenever underlying CPU changes.
+	 *
+	 * On KVM, we are guaranteed that pvti updates for any vCPU are
+	 * atomic as seen by *all* vCPUs.  This is an even stronger
+	 * guarantee than we get with a normal seqlock.
 	 *
+	 * On Xen, we don't appear to have that guarantee, but Xen still
+	 * supplies a valid seqlock using the version field.
+
+	 * We only do pvclock vdso timing at all if
+	 * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
+	 * mean that all vCPUs have matching pvti and that the TSC is
+	 * synced, so we can just look at vCPU 0's pvti.
 	 */
-	do {
-		cpu = __getcpu() & VGETCPU_CPU_MASK;
-		/* TODO: We can put vcpu id into higher bits of pvti.version.
-		 * This will save a couple of cycles by getting rid of
-		 * __getcpu() calls (Gleb).
-		 */
-
-		pvti = get_pvti(cpu);
-
-		version = __pvclock_read_cycles(&pvti->pvti, &ret, &flags);
-
-		/*
-		 * Test we're still on the cpu as well as the version.
-		 * We could have been migrated just after the first
-		 * vgetcpu but before fetching the version, so we
-		 * wouldn't notice a version change.
-		 */
-		cpu1 = __getcpu() & VGETCPU_CPU_MASK;
-	} while (unlikely(cpu != cpu1 ||
-			  (pvti->pvti.version & 1) ||
-			  pvti->pvti.version != version));
 
-	if (unlikely(!(flags & PVCLOCK_TSC_STABLE_BIT)))
+	if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
 		*mode = VCLOCK_NONE;
+		return 0;
+	}
+
+	do {
+		version = pvti->version;
+
+		/* This is also a read barrier, so we'll read version first. */
+		tsc = rdtsc_ordered();
+
+		pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
+		pvti_tsc_shift = pvti->tsc_shift;
+		pvti_system_time = pvti->system_time;
+		pvti_tsc = pvti->tsc_timestamp;
+
+		/* Make sure that the version double-check is last. */
+		smp_rmb();
+	} while (unlikely((version & 1) || version != pvti->version));
+
+	delta = tsc - pvti_tsc;
+	ret = pvti_system_time +
+		pvclock_scale_delta(delta, pvti_tsc_to_system_mul,
+				    pvti_tsc_shift);
 
 	/* refer to tsc.c read_tsc() comment for rationale */
 	last = gtod->cycle_last;


Patches currently in stable-queue which might be from luto(a)amacapital.net are

queue-4.4/x86-vdso-pvclock-simplify-and-speed-up-the-vdso-pvclock-reader.patch
queue-4.4/x86-vdso-get-pvclock-data-from-the-vvar-vma-instead-of-the-fixmap.patch
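
The retry loop added by this patch is a seqlock-style reader: sample the version, copy the payload, and retry if the version was odd (update in progress) or changed while copying. The following is only a rough userspace sketch of that idea in C11, not the kernel code. The names struct time_info, scale_delta() and read_clock() are made up for illustration, the caller passes the TSC value in (the kernel samples it with rdtsc_ordered() inside the loop), and C11 fences stand in for the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for pvclock_vcpu_time_info; fields follow the patch. */
struct time_info {
	_Atomic uint32_t version;           /* odd while a writer is mid-update */
	_Atomic uint64_t tsc_timestamp;     /* TSC value at the last update */
	_Atomic uint64_t system_time;       /* nanoseconds at the last update */
	_Atomic uint32_t tsc_to_system_mul; /* 32.32 fixed-point multiplier */
	_Atomic int8_t   tsc_shift;         /* pre-shift applied to the delta */
};

/* Scale a TSC delta to nanoseconds: shift, then multiply by mul_frac / 2^32. */
static inline uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
	assert(shift > -64 && shift < 64);
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	/* (delta * mul_frac) >> 32, split so it never needs 128-bit math. */
	return (delta >> 32) * mul_frac +
	       (((delta & 0xffffffffu) * mul_frac) >> 32);
}

/* Seqlock-style read: retry while the version is odd or has changed. */
static inline uint64_t read_clock(struct time_info *ti, uint64_t tsc)
{
	uint32_t version, mul;
	uint64_t tsc_base, sys_time;
	int8_t shift;

	do {
		/* Acquire load: the payload reads below cannot move above it. */
		version = atomic_load_explicit(&ti->version, memory_order_acquire);

		mul = atomic_load_explicit(&ti->tsc_to_system_mul, memory_order_relaxed);
		shift = atomic_load_explicit(&ti->tsc_shift, memory_order_relaxed);
		sys_time = atomic_load_explicit(&ti->system_time, memory_order_relaxed);
		tsc_base = atomic_load_explicit(&ti->tsc_timestamp, memory_order_relaxed);

		/* Keep the version re-check after the payload reads. */
		atomic_thread_fence(memory_order_acquire);
	} while ((version & 1) ||
		 version != atomic_load_explicit(&ti->version, memory_order_relaxed));

	return sys_time + scale_delta(tsc - tsc_base, mul, shift);
}
```

The fields are copied into locals inside the loop so that the final computation only ever uses a snapshot that passed the version check, which mirrors why the patch reads pvti_system_time and friends before calling pvclock_scale_delta().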


[Linux-stable-mirror] Patch "x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap" has been added to the 4.4-stable tree

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap to the 4.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum… The filename of the patch is: x86-vdso-get-pvclock-data-from-the-vvar-vma-instead-of-the-fixmap.patch and it can be found in the queue-4.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable(a)vger.kernel.org> know about it. >From dac16fba6fc590fa7239676b35ed75dae4c4cd2b Mon Sep 17 00:00:00 2001 From: Andy Lutomirski <luto(a)kernel.org> Date: Thu, 10 Dec 2015 19:20:20 -0800 Subject: x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap From: Andy Lutomirski <luto(a)kernel.org> commit dac16fba6fc590fa7239676b35ed75dae4c4cd2b upstream. Signed-off-by: Andy Lutomirski <luto(a)kernel.org> Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com> Cc: Andy Lutomirski <luto(a)amacapital.net> Cc: Borislav Petkov <bp(a)alien8.de> Cc: Brian Gerst <brgerst(a)gmail.com> Cc: Denys Vlasenko <dvlasenk(a)redhat.com> Cc: H. 
Peter Anvin <hpa(a)zytor.com> Cc: Linus Torvalds <torvalds(a)linux-foundation.org> Cc: Peter Zijlstra <peterz(a)infradead.org> Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: linux-mm(a)kvack.org Link: http://lkml.kernel.org/r/9d37826fdc7e2d2809efe31d5345f97186859284.144970253… Signed-off-by: Ingo Molnar <mingo(a)kernel.org> Cc: Jamie Iles <jamie.iles(a)oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- arch/x86/entry/vdso/vclock_gettime.c | 20 ++++++++------------ arch/x86/entry/vdso/vdso-layout.lds.S | 3 ++- arch/x86/entry/vdso/vdso2c.c | 3 +++ arch/x86/entry/vdso/vma.c | 13 +++++++++++++ arch/x86/include/asm/pvclock.h | 9 +++++++++ arch/x86/include/asm/vdso.h | 1 + arch/x86/kernel/kvmclock.c | 5 +++++ 7 files changed, 41 insertions(+), 13 deletions(-) --- a/arch/x86/entry/vdso/vclock_gettime.c +++ b/arch/x86/entry/vdso/vclock_gettime.c @@ -36,6 +36,11 @@ static notrace cycle_t vread_hpet(void) } #endif +#ifdef CONFIG_PARAVIRT_CLOCK +extern u8 pvclock_page + __attribute__((visibility("hidden"))); +#endif + #ifndef BUILD_VDSO32 #include <linux/kernel.h> @@ -62,23 +67,14 @@ notrace static long vdso_fallback_gtod(s #ifdef CONFIG_PARAVIRT_CLOCK -static notrace const struct pvclock_vsyscall_time_info *get_pvti(int cpu) +static notrace const struct pvclock_vsyscall_time_info *get_pvti0(void) { - const struct pvclock_vsyscall_time_info *pvti_base; - int idx = cpu / (PAGE_SIZE/PVTI_SIZE); - int offset = cpu % (PAGE_SIZE/PVTI_SIZE); - - BUG_ON(PVCLOCK_FIXMAP_BEGIN + idx > PVCLOCK_FIXMAP_END); - - pvti_base = (struct pvclock_vsyscall_time_info *) - __fix_to_virt(PVCLOCK_FIXMAP_BEGIN+idx); - - return &pvti_base[offset]; + return (const struct pvclock_vsyscall_time_info *)&pvclock_page; } static notrace cycle_t vread_pvclock(int *mode) { - const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti; + const struct pvclock_vcpu_time_info *pvti = &get_pvti0()->pvti; cycle_t ret; u64 tsc, pvti_tsc; u64 last, delta, pvti_system_time; --- 
a/arch/x86/entry/vdso/vdso-layout.lds.S +++ b/arch/x86/entry/vdso/vdso-layout.lds.S @@ -25,7 +25,7 @@ SECTIONS * segment. */ - vvar_start = . - 2 * PAGE_SIZE; + vvar_start = . - 3 * PAGE_SIZE; vvar_page = vvar_start; /* Place all vvars at the offsets in asm/vvar.h. */ @@ -36,6 +36,7 @@ SECTIONS #undef EMIT_VVAR hpet_page = vvar_start + PAGE_SIZE; + pvclock_page = vvar_start + 2 * PAGE_SIZE; . = SIZEOF_HEADERS; --- a/arch/x86/entry/vdso/vdso2c.c +++ b/arch/x86/entry/vdso/vdso2c.c @@ -73,6 +73,7 @@ enum { sym_vvar_start, sym_vvar_page, sym_hpet_page, + sym_pvclock_page, sym_VDSO_FAKE_SECTION_TABLE_START, sym_VDSO_FAKE_SECTION_TABLE_END, }; @@ -80,6 +81,7 @@ enum { const int special_pages[] = { sym_vvar_page, sym_hpet_page, + sym_pvclock_page, }; struct vdso_sym { @@ -91,6 +93,7 @@ struct vdso_sym required_syms[] = { [sym_vvar_start] = {"vvar_start", true}, [sym_vvar_page] = {"vvar_page", true}, [sym_hpet_page] = {"hpet_page", true}, + [sym_pvclock_page] = {"pvclock_page", true}, [sym_VDSO_FAKE_SECTION_TABLE_START] = { "VDSO_FAKE_SECTION_TABLE_START", false }, --- a/arch/x86/entry/vdso/vma.c +++ b/arch/x86/entry/vdso/vma.c @@ -100,6 +100,7 @@ static int map_vdso(const struct vdso_im .name = "[vvar]", .pages = no_pages, }; + struct pvclock_vsyscall_time_info *pvti; if (calculate_addr) { addr = vdso_addr(current->mm->start_stack, @@ -169,6 +170,18 @@ static int map_vdso(const struct vdso_im } #endif + pvti = pvclock_pvti_cpu0_va(); + if (pvti && image->sym_pvclock_page) { + ret = remap_pfn_range(vma, + text_start + image->sym_pvclock_page, + __pa(pvti) >> PAGE_SHIFT, + PAGE_SIZE, + PAGE_READONLY); + + if (ret) + goto up_fail; + } + up_fail: if (ret) current->mm->context.vdso = NULL; --- a/arch/x86/include/asm/pvclock.h +++ b/arch/x86/include/asm/pvclock.h @@ -4,6 +4,15 @@ #include <linux/clocksource.h> #include <asm/pvclock-abi.h> +#ifdef CONFIG_PARAVIRT_CLOCK +extern struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void); +#else +static inline struct 
pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void) +{ + return NULL; +} +#endif + /* some helper functions for xen and kvm pv clock sources */ cycle_t pvclock_clocksource_read(struct pvclock_vcpu_time_info *src); u8 pvclock_read_flags(struct pvclock_vcpu_time_info *src); --- a/arch/x86/include/asm/vdso.h +++ b/arch/x86/include/asm/vdso.h @@ -22,6 +22,7 @@ struct vdso_image { long sym_vvar_page; long sym_hpet_page; + long sym_pvclock_page; long sym_VDSO32_NOTE_MASK; long sym___kernel_sigreturn; long sym___kernel_rt_sigreturn; --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -45,6 +45,11 @@ early_param("no-kvmclock", parse_no_kvmc static struct pvclock_vsyscall_time_info *hv_clock; static struct pvclock_wall_clock wall_clock; +struct pvclock_vsyscall_time_info *pvclock_pvti_cpu0_va(void) +{ + return hv_clock; +} + /* * The wallclock is the time of day when we booted. Since then, some time may * have elapsed since the hypervisor wrote the data. So we try to account for Patches currently in stable-queue which might be from luto(a)kernel.org are queue-4.4/x86-paravirt-dont-patch-flush_tlb_single.patch queue-4.4/x86-boot-add-early-cmdline-parsing-for-options-with-arguments.patch queue-4.4/x86-vdso-pvclock-simplify-and-speed-up-the-vdso-pvclock-reader.patch queue-4.4/x86-vdso-get-pvclock-data-from-the-vvar-vma-instead-of-the-fixmap.patch


[Linux-stable-mirror] Init panic in 4.4 PTI backports with CONFIG_PARAVIRT_CLOCK=y

by Jamie Iles

Hi Greg,

Another panic in the 4.4 PTI backports with CONFIG_PARAVIRT_CLOCK=y:

[ 83.008457] init[1]: segfault at ffffffffff5ff0c0 ip 00007ffdcbf609c5 sp 00007ffdcbeb2fa0 error 5
[ 83.012895] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 83.012895]
[ 83.013764] CPU: 3 PID: 1 Comm: init Not tainted 4.4.110-rc1+ #84
[ 83.014345] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014
[ 83.015219] ffff880119207b68 ffff8801191579b8 ffffffff8159b31d ffffffff820441c0
[ 83.015981] ffff880119157a90 ffff880119157a80 ffffffff8120cc1a 0000000041b58ab3
[ 83.016724] ffffffff825a770d ffffffff8120cae0 ffff8801191487b8 0000000000000010
[ 83.017464] Call Trace:
[ 83.017708] [<ffffffff8159b31d>] dump_stack+0x86/0xc9
[ 83.018197] [<ffffffff8120cc1a>] panic+0x13a/0x283
[ 83.018695] [<ffffffff8120cae0>] ? set_ti_thread_flag+0xf/0xf
[ 83.019259] [<ffffffff81109a52>] ? mark_held_locks+0x22/0xb0
[ 83.019802] [<ffffffff81e6bfec>] ? _raw_write_unlock_irq+0x2c/0x40
[ 83.020391] [<ffffffff810964ef>] do_exit+0x96f/0x1480
[ 83.020885] [<ffffffff81095b80>] ? release_task+0x8e0/0x8e0
[ 83.021437] [<ffffffff81109d48>] ? trace_hardirqs_on_caller+0x268/0x2a0
[ 83.022089] [<ffffffff8109932a>] do_group_exit+0xda/0x160
[ 83.022607] [<ffffffff810ad7b3>] get_signal+0xa93/0xba0
[ 83.023132] [<ffffffff81109a52>] ? mark_held_locks+0x22/0xb0
[ 83.023684] [<ffffffff81007e31>] do_signal+0x91/0xab0
[ 83.024171] [<ffffffff8107abaf>] ? force_sig_info_fault.constprop.19+0xef/0x120
[ 83.024887] [<ffffffff81007da0>] ? setup_sigcontext+0x270/0x270
[ 83.025450] [<ffffffff8107aac0>] ? is_prefetch.isra.16+0x260/0x260
[ 83.026041] [<ffffffff8120d3c2>] ? printk+0x99/0xb5
[ 83.026520] [<ffffffff8120d329>] ? power_down+0xa9/0xa9
[ 83.027057] [<ffffffff811027df>] ? up_read+0x1f/0x40
[ 83.027559] [<ffffffff8120d3c2>] ? printk+0x99/0xb5
[ 83.028030] [<ffffffff8107bb55>] ? __bad_area_nosemaphore+0x265/0x2a0
[ 83.028647] [<ffffffff81109a52>] ? mark_held_locks+0x22/0xb0
[ 83.029199] [<ffffffff810023a2>] ? exit_to_usermode_loop+0x52/0xd0
[ 83.029789] [<ffffffff810023c3>] exit_to_usermode_loop+0x73/0xd0
[ 83.030365] [<ffffffff81003840>] prepare_exit_to_usermode+0x60/0x80
[ 83.031067] [<ffffffff81e6cfc8>] retint_user+0x8/0x3c
[ 83.031699] Kernel Offset: disabled
[ 83.032067] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 83.032067]

This one is from the PVCLOCK_FIXMAP_BEGIN fixmap page, and I can fix it
by additionally picking:

  6b078f5de7fc (x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader)
  dac16fba6fc5 (x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap)

onto the stable-rc linux-4.4.y branch.

Thanks,

Jamie


[Linux-stable-mirror] [PATCH 4.14 00/14] 4.14.12-stable review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 4.14.12 release.
There are 14 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sat Jan 6 12:08:52 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.12-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
    Linux 4.14.12-rc1

Troy Kisky <troy.kisky(a)boundarydevices.com>
    rtc: m41t80: remove unneeded checks from m41t80_sqw_set_rate

Troy Kisky <troy.kisky(a)boundarydevices.com>
    rtc: m41t80: avoid i2c read in m41t80_sqw_is_prepared

Troy Kisky <troy.kisky(a)boundarydevices.com>
    rtc: m41t80: avoid i2c read in m41t80_sqw_recalc_rate

Troy Kisky <troy.kisky(a)boundarydevices.com>
    rtc: m41t80: fix m41t80_sqw_round_rate return value

Troy Kisky <troy.kisky(a)boundarydevices.com>
    rtc: m41t80: m41t80_sqw_set_rate should return 0 on success

Steffen Klassert <steffen.klassert(a)secunet.com>
    Revert "xfrm: Fix stack-out-of-bounds read in xfrm_state_find."

Nick Desaulniers <ndesaulniers(a)google.com>
    x86/process: Define cpu_tss_rw in same section as declaration

Thomas Gleixner <tglx(a)linutronix.de>
    x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()

Josh Poimboeuf <jpoimboe(a)redhat.com>
    x86/dumpstack: Print registers for first stack frame

Josh Poimboeuf <jpoimboe(a)redhat.com>
    x86/dumpstack: Fix partial register dumps

Thomas Gleixner <tglx(a)linutronix.de>
    x86/pti: Make sure the user/kernel PTEs match

Tom Lendacky <thomas.lendacky(a)amd.com>
    x86/cpu, x86/pti: Do not enable PTI on AMD processors

Eric Biggers <ebiggers(a)google.com>
    capabilities: fix buffer overread on very short xattr

Kees Cook <keescook(a)chromium.org>
    exec: Weaken dumpability for secureexec


-------------

Diffstat:

 Makefile                         |  4 +-
 arch/x86/entry/entry_64_compat.S | 13 +++----
 arch/x86/include/asm/unwind.h    | 17 ++++++--
 arch/x86/kernel/cpu/common.c     |  4 +-
 arch/x86/kernel/dumpstack.c      | 31 ++++++++++-----
 arch/x86/kernel/process.c        |  2 +-
 arch/x86/kernel/stacktrace.c     |  2 +-
 arch/x86/mm/pti.c                |  3 +-
 drivers/rtc/rtc-m41t80.c         | 84 ++++++++++++++++++----------------------
 fs/exec.c                        |  9 ++++-
 net/xfrm/xfrm_policy.c           | 29 ++++++++------
 security/commoncap.c             | 21 +++++-----
 12 files changed, 120 insertions(+), 99 deletions(-)


[Linux-stable-mirror] [PATCH 4.9 00/39] 4.9.75-stable review

by Greg Kroah-Hartman

This is the start of the stable review cycle for the 4.9.75 release. There are 39 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Fri Jan 5 19:50:44 UTC 2018. Anything received after that time might be too late. The whole patch series can be found in one patch at: kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.75-rc1.gz or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y and the diffstat can be found below. thanks, greg k-h ------------- Pseudo-Shortlog of commits: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Linux 4.9.75-rc1 Kees Cook <keescook(a)chromium.org> KPTI: Report when enabled Kees Cook <keescook(a)chromium.org> KPTI: Rename to PAGE_TABLE_ISOLATION Borislav Petkov <bp(a)suse.de> x86/kaiser: Move feature detection up Jiri Kosina <jkosina(a)suse.cz> kaiser: disabled on Xen PV Borislav Petkov <bp(a)suse.de> x86/kaiser: Reenable PARAVIRT Thomas Gleixner <tglx(a)linutronix.de> x86/paravirt: Dont patch flush_tlb_single Hugh Dickins <hughd(a)google.com> kaiser: kaiser_flush_tlb_on_return_to_user() check PCID Hugh Dickins <hughd(a)google.com> kaiser: asm/tlbflush.h handle noPGE at lower level Hugh Dickins <hughd(a)google.com> kaiser: drop is_atomic arg to kaiser_pagetable_walk() Hugh Dickins <hughd(a)google.com> kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush Borislav Petkov <bp(a)suse.de> x86/kaiser: Check boottime cmdline params Borislav Petkov <bp(a)suse.de> x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling Hugh Dickins <hughd(a)google.com> kaiser: add "nokaiser" boot option, using ALTERNATIVE Hugh Dickins <hughd(a)google.com> kaiser: fix unlikely error in alloc_ldt_struct() Hugh Dickins <hughd(a)google.com> kaiser: kaiser_remove_mapping() move along the pgd Hugh Dickins <hughd(a)google.com> kaiser: paranoid_entry pass cr3 need to 
paranoid_exit Hugh Dickins <hughd(a)google.com> kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user Hugh Dickins <hughd(a)google.com> kaiser: PCID 0 for kernel and 128 for user Hugh Dickins <hughd(a)google.com> kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user Hugh Dickins <hughd(a)google.com> kaiser: enhanced by kernel and user PCIDs Hugh Dickins <hughd(a)google.com> kaiser: vmstat show NR_KAISERTABLE as nr_overhead Hugh Dickins <hughd(a)google.com> kaiser: delete KAISER_REAL_SWITCH option Hugh Dickins <hughd(a)google.com> kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET Hugh Dickins <hughd(a)google.com> kaiser: cleanups while trying for gold link Hugh Dickins <hughd(a)google.com> kaiser: align addition to x86/mm/Makefile Hugh Dickins <hughd(a)google.com> kaiser: tidied up kaiser_add/remove_mapping slightly Hugh Dickins <hughd(a)google.com> kaiser: tidied up asm/kaiser.h somewhat Hugh Dickins <hughd(a)google.com> kaiser: ENOMEM if kaiser_pagetable_walk() NULL Hugh Dickins <hughd(a)google.com> kaiser: fix perf crashes Hugh Dickins <hughd(a)google.com> kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER Hugh Dickins <hughd(a)google.com> kaiser: KAISER depends on SMP Hugh Dickins <hughd(a)google.com> kaiser: fix build and FIXME in alloc_ldt_struct() Hugh Dickins <hughd(a)google.com> kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE Hugh Dickins <hughd(a)google.com> kaiser: do not set _PAGE_NX on pgd_none Dave Hansen <dave.hansen(a)linux.intel.com> kaiser: merged update Richard Fellner <richard.fellner(a)student.tugraz.at> KAISER: Kernel Address Isolation Tom Lendacky <thomas.lendacky(a)amd.com> x86/boot: Add early cmdline parsing for options with arguments Neal Cardwell <ncardwell(a)google.com> tcp_bbr: reset long-term bandwidth sampling on loss recovery undo Neal Cardwell <ncardwell(a)google.com> tcp_bbr: reset full pipe detection on loss recovery undo ------------- Diffstat: Documentation/kernel-parameters.txt | 8 + Makefile | 4 +- 
arch/x86/boot/compressed/misc.h | 1 + arch/x86/entry/entry_64.S | 163 ++++++++-- arch/x86/entry/entry_64_compat.S | 8 +- arch/x86/events/intel/ds.c | 57 +++- arch/x86/include/asm/cmdline.h | 2 + arch/x86/include/asm/cpufeatures.h | 4 + arch/x86/include/asm/desc.h | 2 +- arch/x86/include/asm/hw_irq.h | 2 +- arch/x86/include/asm/kaiser.h | 141 +++++++++ arch/x86/include/asm/pgtable.h | 28 +- arch/x86/include/asm/pgtable_64.h | 25 +- arch/x86/include/asm/pgtable_types.h | 29 +- arch/x86/include/asm/processor.h | 2 +- arch/x86/include/asm/tlbflush.h | 74 ++++- arch/x86/include/uapi/asm/processor-flags.h | 3 +- arch/x86/kernel/cpu/common.c | 28 +- arch/x86/kernel/espfix_64.c | 10 + arch/x86/kernel/head_64.S | 35 ++- arch/x86/kernel/irqinit.c | 2 +- arch/x86/kernel/ldt.c | 25 +- arch/x86/kernel/paravirt_patch_64.c | 2 - arch/x86/kernel/process.c | 2 +- arch/x86/kernel/setup.c | 7 + arch/x86/kernel/tracepoint.c | 2 + arch/x86/kvm/x86.c | 3 +- arch/x86/lib/cmdline.c | 105 +++++++ arch/x86/mm/Makefile | 4 +- arch/x86/mm/init.c | 2 +- arch/x86/mm/init_64.c | 10 + arch/x86/mm/kaiser.c | 454 ++++++++++++++++++++++++++++ arch/x86/mm/kaslr.c | 4 +- arch/x86/mm/pageattr.c | 63 +++- arch/x86/mm/pgtable.c | 12 +- arch/x86/mm/tlb.c | 39 ++- include/asm-generic/vmlinux.lds.h | 7 + include/linux/kaiser.h | 52 ++++ include/linux/mmzone.h | 3 +- include/linux/percpu-defs.h | 32 +- init/main.c | 2 + kernel/fork.c | 6 + mm/vmstat.c | 1 + net/ipv4/tcp_bbr.c | 5 + security/Kconfig | 10 + tools/arch/x86/include/asm/cpufeatures.h | 3 + 46 files changed, 1382 insertions(+), 101 deletions(-)


[Linux-stable-mirror] [PATCH] iwlwifi: pcie: fix DMA memory mapping / unmapping

by Emmanuel Grumbach

22000 devices (previously referenced as A000) can support
short transmit queues. This means that we have fewer DMA
descriptors (TFD) for those shorter queues.
Previous devices must still have 256 TFDs for each queue
even if those 256 TFDs point to fewer buffers.

When I introduced support for the short queues for 22000
I broke older devices by assuming that they can also have
fewer TFDs in their queues. This led to several problems:

1) the payload of the commands wasn't unmapped properly,
   which caused the SWIOTLB to complain at some point.
2) the hardware could get confused and we get hardware
   crashes.

The corresponding bugzilla entries are:

https://bugzilla.kernel.org/show_bug.cgi?id=198201
https://bugzilla.kernel.org/show_bug.cgi?id=198265

Cc: stable(a)vger.kernel.org # 4.14+
Fixes: 4ecab5616023 ("iwlwifi: pcie: support short Tx queues for A000 device family")
Reviewed-by: Sharon, Sara <sara.sharon(a)intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach(a)intel.com>
---
Hi Kalle,

Luca is on vacation and 4.15 will be closed soon. I am fixing here
a bug that caused much trouble on our side. There are two bugzillas
on it. Users on both bugs validated this fix.
Please apply this on wireless-drivers.git directly and I'll sync
with Luca when he is back.

Thank you!
--- drivers/net/wireless/intel/iwlwifi/pcie/internal.h | 10 +++++++--- drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 11 +++-------- drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 8 ++++---- 3 files changed, 14 insertions(+), 15 deletions(-) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h index d749abeca3ae..403e65c309d0 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h +++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h @@ -670,11 +670,15 @@ static inline u8 iwl_pcie_get_cmd_index(struct iwl_txq *q, u32 index) return index & (q->n_window - 1); } -static inline void *iwl_pcie_get_tfd(struct iwl_trans_pcie *trans_pcie, +static inline void *iwl_pcie_get_tfd(struct iwl_trans *trans, struct iwl_txq *txq, int idx) { - return txq->tfds + trans_pcie->tfd_size * iwl_pcie_get_cmd_index(txq, - idx); + struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); + + if (trans->cfg->use_tfh) + idx = iwl_pcie_get_cmd_index(txq, idx); + + return txq->tfds + trans_pcie->tfd_size * idx; } static inline void iwl_enable_rfkill_int(struct iwl_trans *trans) diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c index 16b345f54ff0..6d0a907d5ba5 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c @@ -171,8 +171,6 @@ static void iwl_pcie_gen2_tfd_unmap(struct iwl_trans *trans, static void iwl_pcie_gen2_free_tfd(struct iwl_trans *trans, struct iwl_txq *txq) { - struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); - /* rd_ptr is bounded by TFD_QUEUE_SIZE_MAX and * idx is bounded by n_window */ @@ -181,7 +179,7 @@ static void iwl_pcie_gen2_free_tfd(struct iwl_trans *trans, struct iwl_txq *txq) lockdep_assert_held(&txq->lock); iwl_pcie_gen2_tfd_unmap(trans, &txq->entries[idx].meta, - iwl_pcie_get_tfd(trans_pcie, txq, idx)); + iwl_pcie_get_tfd(trans, 
txq, idx)); /* free SKB */ if (txq->entries) { @@ -364,11 +362,9 @@ struct iwl_tfh_tfd *iwl_pcie_gen2_build_tfd(struct iwl_trans *trans, struct sk_buff *skb, struct iwl_cmd_meta *out_meta) { - struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data; int idx = iwl_pcie_get_cmd_index(txq, txq->write_ptr); - struct iwl_tfh_tfd *tfd = - iwl_pcie_get_tfd(trans_pcie, txq, idx); + struct iwl_tfh_tfd *tfd = iwl_pcie_get_tfd(trans, txq, idx); dma_addr_t tb_phys; bool amsdu; int i, len, tb1_len, tb2_len, hdr_len; @@ -565,8 +561,7 @@ static int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans, u8 group_id = iwl_cmd_groupid(cmd->id); const u8 *cmddata[IWL_MAX_CMD_TBS_PER_TFD]; u16 cmdlen[IWL_MAX_CMD_TBS_PER_TFD]; - struct iwl_tfh_tfd *tfd = - iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr); + struct iwl_tfh_tfd *tfd = iwl_pcie_get_tfd(trans, txq, txq->write_ptr); memset(tfd, 0, sizeof(*tfd)); diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c index fed6d842a5e1..3f85713c41dc 100644 --- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c @@ -373,7 +373,7 @@ static void iwl_pcie_tfd_unmap(struct iwl_trans *trans, { struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans); int i, num_tbs; - void *tfd = iwl_pcie_get_tfd(trans_pcie, txq, index); + void *tfd = iwl_pcie_get_tfd(trans, txq, index); /* Sanity check on number of chunks */ num_tbs = iwl_pcie_tfd_get_num_tbs(trans, tfd); @@ -2018,7 +2018,7 @@ static int iwl_fill_data_tbs(struct iwl_trans *trans, struct sk_buff *skb, } trace_iwlwifi_dev_tx(trans->dev, skb, - iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr), + iwl_pcie_get_tfd(trans, txq, txq->write_ptr), trans_pcie->tfd_size, &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len, hdr_len); @@ -2092,7 +2092,7 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb, 
IEEE80211_CCMP_HDR_LEN : 0; trace_iwlwifi_dev_tx(trans->dev, skb, - iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr), + iwl_pcie_get_tfd(trans, txq, txq->write_ptr), trans_pcie->tfd_size, &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len, 0); @@ -2425,7 +2425,7 @@ int iwl_trans_pcie_tx(struct iwl_trans *trans, struct sk_buff *skb, memcpy(&txq->first_tb_bufs[txq->write_ptr], &dev_cmd->hdr, IWL_FIRST_TB_SIZE); - tfd = iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr); + tfd = iwl_pcie_get_tfd(trans, txq, txq->write_ptr); /* Set up entry for this TFD in Tx byte-count array */ iwl_pcie_txq_update_byte_cnt_tbl(trans, txq, le16_to_cpu(tx_cmd->len), iwl_pcie_tfd_get_num_tbs(trans, tfd)); -- 2.9.3
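
The core of the fix is in iwl_pcie_get_tfd(): only devices with use_tfh set (the 22000 family) wrap the index into the short window, while older devices keep the raw index into their full 256-entry TFD array. A minimal standalone sketch of that index arithmetic follows. struct txq_sketch, cmd_index() and tfd_offset() are hypothetical names for illustration and not the driver's API, and n_window is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a Tx queue for illustration only. */
struct txq_sketch {
	size_t   tfd_size;  /* size of one DMA descriptor (TFD) in bytes */
	uint32_t n_window;  /* number of entries in the window, power of two */
};

/* Wrap an index into the window; valid only when n_window is a power of two. */
static inline uint32_t cmd_index(const struct txq_sketch *q, uint32_t index)
{
	return index & (q->n_window - 1);
}

/*
 * Byte offset of the TFD for idx. Only "use_tfh" devices have queues
 * short enough that the index must wrap; older devices index the full
 * descriptor array directly.
 */
static inline size_t tfd_offset(const struct txq_sketch *q, bool use_tfh,
				uint32_t idx)
{
	if (use_tfh)
		idx = cmd_index(q, idx);
	return q->tfd_size * idx;
}
```

With a 32-entry window, index 40 maps to slot 8 on a use_tfh device but stays at slot 40 on an older one. Wrapping it unconditionally, as the code did before this patch, would make an older device's slot 40 alias slot 8.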


[Linux-stable-mirror] [PATCH] crypto: algapi - fix NULL dereference in crypto_remove_spawns()

by Eric Biggers

From: Eric Biggers <ebiggers(a)google.com>

syzkaller triggered a NULL pointer dereference in crypto_remove_spawns()
via a program that repeatedly and concurrently requests AEADs
"authenc(cmac(des3_ede-asm),pcbc-aes-aesni)" and hashes "cmac(des3_ede)"
through AF_ALG, where the hashes are requested as "untested"
(CRYPTO_ALG_TESTED is set in ->salg_mask but clear in ->salg_feat; this
causes the template to be instantiated for every request).

Although AF_ALG users really shouldn't be able to request an "untested"
algorithm, the NULL pointer dereference is actually caused by a
longstanding race condition where crypto_remove_spawns() can encounter
an instance which has had spawn(s) "grabbed" but hasn't yet been
registered, resulting in ->cra_users still being NULL.

We probably should properly initialize ->cra_users earlier, but that
would require updating many templates individually.  For now just fix
the bug in a simple way that can easily be backported: make
crypto_remove_spawns() treat a NULL ->cra_users list as empty.

Reported-by: syzbot <syzkaller(a)googlegroups.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
 crypto/algapi.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/crypto/algapi.c b/crypto/algapi.c
index 9895cafcce7e..395b082d03a9 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -166,6 +166,18 @@ void crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list,
 
 			spawn->alg = NULL;
 			spawns = &inst->alg.cra_users;
+
+			/*
+			 * We may encounter an unregistered instance here, since
+			 * an instance's spawns are set up prior to the instance
+			 * being registered.  An unregistered instance will have
+			 * NULL ->cra_users.next, since ->cra_users isn't
+			 * properly initialized until registration.  But an
+			 * unregistered instance cannot have any users, so treat
+			 * it the same as ->cra_users being empty.
+			 */
+			if (spawns->next == NULL)
+				break;
 		}
 	} while ((spawns = crypto_more_spawns(alg, &stack, &top,
 					      &secondary_spawns)));
-- 
2.15.1
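
The fix relies on a property of circular doubly linked lists: a properly initialized empty head points at itself, whereas a zero-filled, never-initialized head has a NULL next pointer. A small standalone sketch of treating both cases as "no users" is shown below. This struct list_head is a local re-declaration for illustration, and users_list_is_empty() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local re-declaration of a circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized empty list points at itself in both directions. */
static inline void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node right after head; head must already be initialized. */
static inline void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/*
 * Hypothetical helper: report "no users" both for an initialized empty
 * list and for a zero-filled head that was never initialized, instead
 * of walking through a NULL next pointer.
 */
static inline bool users_list_is_empty(const struct list_head *head)
{
	return head->next == NULL || head->next == head;
}
```

The patch itself checks only spawns->next == NULL and breaks out of the loop. The sketch folds in the ordinary empty-list check purely to show that both states mean there is nothing to walk.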


Re: [Linux-stable-mirror] [PATCH 4.14 00/14] 4.14.12-stable review

by Greg Kroah-Hartman

On Thu, Jan 04, 2018 at 04:12:31PM -0800, Kevin Hilman wrote:
> kernelci.org bot <bot(a)kernelci.org> writes:
>
> > stable-rc/linux-4.14.y boot: 118 boots: 4 failed, 113 passed with 1 offline (v4.14.11-15-g732141e47ee6)
> >
> > Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.14.y/kernel/v4.1…
> > Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.14.y/kernel/v4.14.11-15…
> >
> > Tree: stable-rc
> > Branch: linux-4.14.y
> > Git Describe: v4.14.11-15-g732141e47ee6
> > Git Commit: 732141e47ee614d70aeb8ad828a977ad19447e87
> > Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > Tested: 68 unique boards, 23 SoC families, 16 builds out of 185
> >
> > Boot Regressions Detected:
>
> TL;DR; All is well.

Thanks for the summary of all of these, and for your continued testing.

greg k-h


[Linux-stable-mirror] [PATCH] drm/i915: forward hotplug events again

by Rodrigo Vivi

As mentioned in commit 88be58be886f ("drm/i915/fbdev: Always forward
hotplug events"), we have real, valid cases of hotplugs where fbdev is
not fully set up yet.

Unfortunately, this removes the checkpoint after the sync point. So
probably we can live without it. Or we need a more robust serialization.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104158
Fixes: a45b30a6c5db ("drm/i915/fbdev: Serialise early hotplug events with async fbdev config")
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: jrg2718(a)gmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
---
 drivers/gpu/drm/i915/intel_fbdev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index da48af11eb6b..7a6069b389f2 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -801,8 +801,7 @@ void intel_fbdev_output_poll_changed(struct drm_device *dev)
 		return;
 
 	intel_fbdev_sync(ifbdev);
-	if (ifbdev->vma)
-		drm_fb_helper_hotplug_event(&ifbdev->helper);
+	drm_fb_helper_hotplug_event(&ifbdev->helper);
 }
 
 void intel_fbdev_restore_mode(struct drm_device *dev)
--
2.13.6


Re: [Linux-stable-mirror] [Intel-gfx] [PATCH] drm/i915: Whitelist SLICE_COMMON_ECO_CHICKEN1 on Geminilake.

by Rodrigo Vivi

On Thu, Jan 04, 2018 at 11:39:23PM +0000, Kenneth Graunke wrote:
> On Thursday, January 4, 2018 1:23:06 PM PST Chris Wilson wrote:
> > Quoting Kenneth Graunke (2018-01-04 19:38:05)
> > > Geminilake requires the 3D driver to select whether barriers are
> > > intended for compute shaders, or tessellation control shaders, by
> > > whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
> > > switching pipelines. Failure to do this properly can result in GPU
> > > hangs.
> > >
> > > Unfortunately, this means it needs to switch mid-batch, so only
> > > userspace can properly set it. To facilitate this, the kernel needs
> > > to whitelist the register.
> > >
> > > Signed-off-by: Kenneth Graunke <kenneth(a)whitecape.org>
> > > Cc: stable(a)vger.kernel.org
> > > ---
> > > drivers/gpu/drm/i915/i915_reg.h | 2 ++
> > > drivers/gpu/drm/i915/intel_engine_cs.c | 5 +++++
> > > 2 files changed, 7 insertions(+)
> > >
> > > Hello,
> > >
> > > We unfortunately need to whitelist an extra register for GPU hang fix
> > > on Geminilake. Here's the corresponding Mesa patch:
> >
> > Thankfully it appears to be context saved. Has a w/a name been assigned
> > for this?
> > -Chris
>
> There doesn't appear to be one. The workaround page lists it, but there
> is no name. The register description has a note saying that you need to
> set this, but doesn't call it out as a workaround.

It mentions only BXT:ALL, with no mention of GLK. Should we add it to
both then?

>
> That's why I put a generic comment, rather than the name.

On the Display side we started using the row name for this case, to
make it easier to find later. For example:

"Display WA #0390: skl,kbl"

The number for this one apparently is WA #0862.

Maybe we could use this one to start:

/* GT WA #0862: bxt,glk */

GT? GEM? Unnamed WA #0862?

Thanks,
Rodrigo.

>
> --Ken
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx(a)lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

