This is a note to let you know that I've just added the patch titled
x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-vdso-pvclock-simplify-and-speed-up-the-vdso-pvclock-reader.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 6b078f5de7fc0851af4102493c7b5bb07e49c4cb Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)amacapital.net>
Date: Thu, 10 Dec 2015 19:20:19 -0800
Subject: x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
From: Andy Lutomirski <luto(a)amacapital.net>
commit 6b078f5de7fc0851af4102493c7b5bb07e49c4cb upstream.
The pvclock vdso code was too abstracted to understand easily
and excessively paranoid. Simplify it for a huge speedup.
This opens the door for additional simplifications, as the vdso
no longer accesses the pvti for any vcpu other than vcpu 0.
Before, vclock_gettime using kvm-clock took about 45ns on my
machine. With this change, it takes 29ns, which is almost as
fast as the pure TSC implementation.
Signed-off-by: Andy Lutomirski <luto(a)amacapital.net>
Reviewed-by: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: linux-mm(a)kvack.org
Link: http://lkml.kernel.org/r/6b51dcc41f1b101f963945c5ec7093d72bdac429.144970253…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Jamie Iles <jamie.iles(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/entry/vdso/vclock_gettime.c | 79 +++++++++++++++++++----------------
1 file changed, 45 insertions(+), 34 deletions(-)
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -78,47 +78,58 @@ static notrace const struct pvclock_vsys
static notrace cycle_t vread_pvclock(int *mode)
{
- const struct pvclock_vsyscall_time_info *pvti;
+ const struct pvclock_vcpu_time_info *pvti = &get_pvti(0)->pvti;
cycle_t ret;
- u64 last;
- u32 version;
- u8 flags;
- unsigned cpu, cpu1;
-
+ u64 tsc, pvti_tsc;
+ u64 last, delta, pvti_system_time;
+ u32 version, pvti_tsc_to_system_mul, pvti_tsc_shift;
/*
- * Note: hypervisor must guarantee that:
- * 1. cpu ID number maps 1:1 to per-CPU pvclock time info.
- * 2. that per-CPU pvclock time info is updated if the
- * underlying CPU changes.
- * 3. that version is increased whenever underlying CPU
- * changes.
+ * Note: The kernel and hypervisor must guarantee that cpu ID
+ * number maps 1:1 to per-CPU pvclock time info.
+ *
+ * Because the hypervisor is entirely unaware of guest userspace
+ * preemption, it cannot guarantee that per-CPU pvclock time
+ * info is updated if the underlying CPU changes or that that
+ * version is increased whenever underlying CPU changes.
+ *
+ * On KVM, we are guaranteed that pvti updates for any vCPU are
+ * atomic as seen by *all* vCPUs. This is an even stronger
+ * guarantee than we get with a normal seqlock.
*
+ * On Xen, we don't appear to have that guarantee, but Xen still
+ * supplies a valid seqlock using the version field.
+
+ * We only do pvclock vdso timing at all if
+ * PVCLOCK_TSC_STABLE_BIT is set, and we interpret that bit to
+ * mean that all vCPUs have matching pvti and that the TSC is
+ * synced, so we can just look at vCPU 0's pvti.
*/
- do {
- cpu = __getcpu() & VGETCPU_CPU_MASK;
- /* TODO: We can put vcpu id into higher bits of pvti.version.
- * This will save a couple of cycles by getting rid of
- * __getcpu() calls (Gleb).
- */
-
- pvti = get_pvti(cpu);
-
- version = __pvclock_read_cycles(&pvti->pvti, &ret, &flags);
-
- /*
- * Test we're still on the cpu as well as the version.
- * We could have been migrated just after the first
- * vgetcpu but before fetching the version, so we
- * wouldn't notice a version change.
- */
- cpu1 = __getcpu() & VGETCPU_CPU_MASK;
- } while (unlikely(cpu != cpu1 ||
- (pvti->pvti.version & 1) ||
- pvti->pvti.version != version));
- if (unlikely(!(flags & PVCLOCK_TSC_STABLE_BIT)))
+ if (unlikely(!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))) {
*mode = VCLOCK_NONE;
+ return 0;
+ }
+
+ do {
+ version = pvti->version;
+
+ /* This is also a read barrier, so we'll read version first. */
+ tsc = rdtsc_ordered();
+
+ pvti_tsc_to_system_mul = pvti->tsc_to_system_mul;
+ pvti_tsc_shift = pvti->tsc_shift;
+ pvti_system_time = pvti->system_time;
+ pvti_tsc = pvti->tsc_timestamp;
+
+ /* Make sure that the version double-check is last. */
+ smp_rmb();
+ } while (unlikely((version & 1) || version != pvti->version));
+
+ delta = tsc - pvti_tsc;
+ ret = pvti_system_time +
+ pvclock_scale_delta(delta, pvti_tsc_to_system_mul,
+ pvti_tsc_shift);
/* refer to tsc.c read_tsc() comment for rationale */
last = gtod->cycle_last;
Patches currently in stable-queue which might be from luto(a)amacapital.net are
queue-4.4/x86-vdso-pvclock-simplify-and-speed-up-the-vdso-pvclock-reader.patch
queue-4.4/x86-vdso-get-pvclock-data-from-the-vvar-vma-instead-of-the-fixmap.patch
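
For readers who want to see the shape of the new read loop outside the vdso, here is a minimal userspace sketch in C11. It is not the kernel code: struct demo_time_info, demo_scale_delta() and demo_read() are hypothetical names, the TSC value is passed in as a parameter instead of being read with rdtsc_ordered(), and C11 acquire operations stand in for the kernel barriers. The scaling helper follows the usual pvclock convention (apply the shift, then multiply by a 32.32 fixed-point factor), which is assumed here rather than taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the pvti fields that the patch reads. */
struct demo_time_info {
	_Atomic uint32_t version;	/* odd while an update is in flight */
	uint64_t tsc_timestamp;		/* TSC value at the last update */
	uint64_t system_time;		/* nanoseconds at the last update */
	uint32_t tsc_to_system_mul;	/* 32.32 fixed-point multiplier */
	int8_t tsc_shift;		/* pre-shift applied to the delta */
};

/* Scale a TSC delta: shift first, then multiply by mul / 2^32. */
static uint64_t demo_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	assert(shift > -64 && shift < 64);
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;

	/* (delta * mul) >> 32, split so that no 128-bit type is needed. */
	return (delta >> 32) * mul + (((delta & 0xffffffffu) * mul) >> 32);
}

/*
 * Seqlock-style read: snapshot the version, copy the fields, then
 * re-check the version. Retry if an update was in progress (odd
 * version) or completed in between (version changed).
 *
 * The plain field reads are fine for this single-threaded demo; a
 * real concurrent reader needs the platform's own guarantees.
 */
static uint64_t demo_read(struct demo_time_info *ti, uint64_t tsc)
{
	uint32_t version, mul;
	uint64_t stamp, base;
	int8_t shift;

	do {
		version = atomic_load_explicit(&ti->version,
					       memory_order_acquire);
		mul = ti->tsc_to_system_mul;
		shift = ti->tsc_shift;
		base = ti->system_time;
		stamp = ti->tsc_timestamp;

		/* Keep the field reads ahead of the version re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while ((version & 1) ||
		 version != atomic_load_explicit(&ti->version,
						 memory_order_relaxed));

	return base + demo_scale_delta(tsc - stamp, mul, shift);
}
```

With tsc_shift = 1 and tsc_to_system_mul = 0x80000000 (a factor of 0.5), a delta of 100 cycles is doubled and then halved, so demo_read() returns system_time + 100.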
This is the start of the stable review cycle for the 4.14.12 release.
There are 14 patches in this series; all will be posted as responses
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat Jan 6 12:08:52 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.12-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.12-rc1
Troy Kisky <troy.kisky(a)boundarydevices.com>
rtc: m41t80: remove unneeded checks from m41t80_sqw_set_rate
Troy Kisky <troy.kisky(a)boundarydevices.com>
rtc: m41t80: avoid i2c read in m41t80_sqw_is_prepared
Troy Kisky <troy.kisky(a)boundarydevices.com>
rtc: m41t80: avoid i2c read in m41t80_sqw_recalc_rate
Troy Kisky <troy.kisky(a)boundarydevices.com>
rtc: m41t80: fix m41t80_sqw_round_rate return value
Troy Kisky <troy.kisky(a)boundarydevices.com>
rtc: m41t80: m41t80_sqw_set_rate should return 0 on success
Steffen Klassert <steffen.klassert(a)secunet.com>
Revert "xfrm: Fix stack-out-of-bounds read in xfrm_state_find."
Nick Desaulniers <ndesaulniers(a)google.com>
x86/process: Define cpu_tss_rw in same section as declaration
Thomas Gleixner <tglx(a)linutronix.de>
x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/dumpstack: Print registers for first stack frame
Josh Poimboeuf <jpoimboe(a)redhat.com>
x86/dumpstack: Fix partial register dumps
Thomas Gleixner <tglx(a)linutronix.de>
x86/pti: Make sure the user/kernel PTEs match
Tom Lendacky <thomas.lendacky(a)amd.com>
x86/cpu, x86/pti: Do not enable PTI on AMD processors
Eric Biggers <ebiggers(a)google.com>
capabilities: fix buffer overread on very short xattr
Kees Cook <keescook(a)chromium.org>
exec: Weaken dumpability for secureexec
-------------
Diffstat:
Makefile | 4 +-
arch/x86/entry/entry_64_compat.S | 13 +++----
arch/x86/include/asm/unwind.h | 17 ++++++--
arch/x86/kernel/cpu/common.c | 4 +-
arch/x86/kernel/dumpstack.c | 31 ++++++++++-----
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/stacktrace.c | 2 +-
arch/x86/mm/pti.c | 3 +-
drivers/rtc/rtc-m41t80.c | 84 ++++++++++++++++++----------------------
fs/exec.c | 9 ++++-
net/xfrm/xfrm_policy.c | 29 ++++++++------
security/commoncap.c | 21 +++++-----
12 files changed, 120 insertions(+), 99 deletions(-)
From: Eric Biggers <ebiggers(a)google.com>
syzkaller triggered a NULL pointer dereference in crypto_remove_spawns()
via a program that repeatedly and concurrently requests AEADs
"authenc(cmac(des3_ede-asm),pcbc-aes-aesni)" and hashes "cmac(des3_ede)"
through AF_ALG, where the hashes are requested as "untested"
(CRYPTO_ALG_TESTED is set in ->salg_mask but clear in ->salg_feat; this
causes the template to be instantiated for every request).
Although AF_ALG users really shouldn't be able to request an "untested"
algorithm, the NULL pointer dereference is actually caused by a
longstanding race condition where crypto_remove_spawns() can encounter
an instance which has had spawn(s) "grabbed" but hasn't yet been
registered, resulting in ->cra_users still being NULL.
We probably should properly initialize ->cra_users earlier, but that
would require updating many templates individually. For now just fix
the bug in a simple way that can easily be backported: make
crypto_remove_spawns() treat a NULL ->cra_users list as empty.
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/algapi.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/crypto/algapi.c b/crypto/algapi.c
index 9895cafcce7e..395b082d03a9 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -166,6 +166,18 @@ void crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list,
spawn->alg = NULL;
spawns = &inst->alg.cra_users;
+
+ /*
+ * We may encounter an unregistered instance here, since
+ * an instance's spawns are set up prior to the instance
+ * being registered. An unregistered instance will have
+ * NULL ->cra_users.next, since ->cra_users isn't
+ * properly initialized until registration. But an
+ * unregistered instance cannot have any users, so treat
+ * it the same as ->cra_users being empty.
+ */
+ if (spawns->next == NULL)
+ break;
}
} while ((spawns = crypto_more_spawns(alg, &stack, &top,
&secondary_spawns)));
--
2.15.1
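
The idea behind the fix above can be sketched in standalone C: a list head that was never initialized has a NULL ->next, and the walker treats that the same as an initialized, empty list. The names below (struct demo_list_head, demo_no_users(), demo_count_users()) are hypothetical and only mimic the shape of the kernel's struct list_head; this is not the crypto API code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* An initialized, empty list points at itself. */
static void demo_list_init(struct demo_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert a new entry right after the head. */
static void demo_list_add(struct demo_list_head *entry,
			  struct demo_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * True if there is nothing to walk: either the head was never
 * initialized (zeroed, so ->next is NULL) or it is initialized
 * and empty.
 */
static bool demo_no_users(const struct demo_list_head *head)
{
	return head->next == NULL || head->next == head;
}

/* Count entries, treating an uninitialized head as an empty list. */
static size_t demo_count_users(const struct demo_list_head *head)
{
	const struct demo_list_head *pos;
	size_t n = 0;

	if (demo_no_users(head))
		return 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

Without the NULL check, the loop in demo_count_users() would dereference a NULL pointer on a zeroed head, which is the failure mode the commit message describes for an unregistered instance.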
As mentioned in commit 88be58be886f ("drm/i915/fbdev:
Always forward hotplug events"), we have real, valid cases
of hotplugs where fbdev is not fully set up yet.
Unfortunately, this patch removes the checkpoint after the sync point.
So probably we can live without it. Or we need a more robust
serialization.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104158
Fixes: a45b30a6c5db ("drm/i915/fbdev: Serialise early hotplug events with async fbdev config")
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Lukas Wunner <lukas(a)wunner.de>
Cc: jrg2718(a)gmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
---
drivers/gpu/drm/i915/intel_fbdev.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index da48af11eb6b..7a6069b389f2 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -801,8 +801,7 @@ void intel_fbdev_output_poll_changed(struct drm_device *dev)
return;
intel_fbdev_sync(ifbdev);
- if (ifbdev->vma)
- drm_fb_helper_hotplug_event(&ifbdev->helper);
+ drm_fb_helper_hotplug_event(&ifbdev->helper);
}
void intel_fbdev_restore_mode(struct drm_device *dev)
--
2.13.6
On Thu, Jan 04, 2018 at 11:39:23PM +0000, Kenneth Graunke wrote:
> On Thursday, January 4, 2018 1:23:06 PM PST Chris Wilson wrote:
> > Quoting Kenneth Graunke (2018-01-04 19:38:05)
> > > Geminilake requires the 3D driver to select whether barriers are
> > > intended for compute shaders, or tessellation control shaders, by
> > > whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
> > > switching pipelines. Failure to do this properly can result in GPU
> > > hangs.
> > >
> > > Unfortunately, this means it needs to switch mid-batch, so only
> > > userspace can properly set it. To facilitate this, the kernel needs
> > > to whitelist the register.
> > >
> > > Signed-off-by: Kenneth Graunke <kenneth(a)whitecape.org>
> > > Cc: stable(a)vger.kernel.org
> > > ---
> > > drivers/gpu/drm/i915/i915_reg.h | 2 ++
> > > drivers/gpu/drm/i915/intel_engine_cs.c | 5 +++++
> > > 2 files changed, 7 insertions(+)
> > >
> > > Hello,
> > >
> > > We unfortunately need to whitelist an extra register for GPU hang fix
> > > on Geminilake. Here's the corresponding Mesa patch:
> >
> > Thankfully it appears to be context saved. Has a w/a name been assigned
> > for this?
> > -Chris
>
> There doesn't appear to be one. The workaround page lists it, but there
> is no name. The register description has a note saying that you need to
> set this, but doesn't call it out as a workaround.
It mentions only BXT:ALL, with no mention of GLK.
Should we add it to both, then?
>
> That's why I put a generic comment, rather than the name.
On the display side we started using the row name for this case, to
make it easier to find later.
ex: "Display WA #0390: skl,kbl"
The number for this apparently is:
WA #0862
Maybe we could use this one to start
/* GT WA #0862: bxt,glk */
GT? GEM?
Unnamed WA #0862?
Thanks,
Rodrigo.
>
> --Ken