This is a note to let you know that I've just added the patch titled
RDMAVT: Fix synchronization around percpu_ref
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
rdmavt-fix-synchronization-around-percpu_ref.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 74b44bbe80b4c62113ac1501482ea1ee40eb9d67 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj(a)kernel.org>
Date: Wed, 14 Mar 2018 12:10:18 -0700
Subject: RDMAVT: Fix synchronization around percpu_ref
From: Tejun Heo <tj(a)kernel.org>
commit 74b44bbe80b4c62113ac1501482ea1ee40eb9d67 upstream.
rvt_mregion uses percpu_ref for reference counting and RCU to protect
accesses from lkey_table. When a rvt_mregion needs to be freed, it
first gets unregistered from lkey_table and then rvt_check_refs() is
called to wait for in-flight usages before the rvt_mregion is freed.
rvt_check_refs() seems to have a couple issues.
* It has a fast exit path which tests percpu_ref_is_zero(). However,
a percpu_ref reading zero doesn't mean that the object can be
released. In fact, the ->release() callback might not even have
started executing yet. Proceeding with freeing can lead to
use-after-free.
* lkey_table is RCU protected but there is no RCU grace period in the
free path. percpu_ref uses RCU internally but it's sched-RCU whose
grace periods are different from regular RCU. Also, it generally
isn't a good idea to depend on internal behaviors like this.
To address the above issues, this patch removes the fast exit and adds
an explicit synchronize_rcu().
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Cc: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Cc: linux-rdma(a)vger.kernel.org
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/sw/rdmavt/mr.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -489,11 +489,13 @@ static int rvt_check_refs(struct rvt_mre
unsigned long timeout;
struct rvt_dev_info *rdi = ib_to_rvt(mr->pd->device);
- if (percpu_ref_is_zero(&mr->refcount))
- return 0;
- /* avoid dma mr */
- if (mr->lkey)
+ if (mr->lkey) {
+ /* avoid dma mr */
rvt_dereg_clean_qps(mr);
+ /* @mr was indexed on rcu protected @lkey_table */
+ synchronize_rcu();
+ }
+
timeout = wait_for_completion_timeout(&mr->comp, 5 * HZ);
if (!timeout) {
rvt_pr_err(rdi,
Patches currently in stable-queue which might be from tj(a)kernel.org are
queue-4.14/rdmavt-fix-synchronization-around-percpu_ref.patch
queue-4.14/fs-aio-add-explicit-rcu-grace-period-when-freeing-kioctx.patch
queue-4.14/fs-aio-use-rcu-accessors-for-kioctx_table-table.patch
This is a note to let you know that I've just added the patch titled
lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
lock_parent-needs-to-recheck-if-dentry-got-__dentry_kill-ed-under-it.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3b821409632ab778d46e807516b457dfa72736ed Mon Sep 17 00:00:00 2001
From: Al Viro <viro(a)zeniv.linux.org.uk>
Date: Fri, 23 Feb 2018 20:47:17 -0500
Subject: lock_parent() needs to recheck if dentry got __dentry_kill'ed under it
From: Al Viro <viro(a)zeniv.linux.org.uk>
commit 3b821409632ab778d46e807516b457dfa72736ed upstream.
In case when dentry passed to lock_parent() is protected from freeing only
by the fact that it's on a shrink list and trylock of parent fails, we
could get hit by __dentry_kill() (and subsequent dentry_kill(parent))
between unlocking dentry and locking presumed parent. We need to recheck
that dentry is alive once we lock both it and parent *and* postpone
rcu_read_unlock() until after that point. Otherwise we could return
a pointer to struct dentry that already is rcu-scheduled for freeing, with
->d_lock held on it; caller's subsequent attempt to unlock it can end
up with memory corruption.
Cc: stable(a)vger.kernel.org # 3.12+, counting backports
Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/dcache.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -644,11 +644,16 @@ again:
spin_unlock(&parent->d_lock);
goto again;
}
- rcu_read_unlock();
- if (parent != dentry)
+ if (parent != dentry) {
spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
- else
+ if (unlikely(dentry->d_lockref.count < 0)) {
+ spin_unlock(&parent->d_lock);
+ parent = NULL;
+ }
+ } else {
parent = NULL;
+ }
+ rcu_read_unlock();
return parent;
}
Patches currently in stable-queue which might be from viro(a)zeniv.linux.org.uk are
queue-4.14/fs-teach-path_connected-to-handle-nfs-filesystems-with-multiple-roots.patch
queue-4.14/lock_parent-needs-to-recheck-if-dentry-got-__dentry_kill-ed-under-it.patch
This is a note to let you know that I've just added the patch titled
kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-arm-arm64-vgic-v3-tighten-synchronization-for-guests-using-v2-on-v3.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 27e91ad1e746e341ca2312f29bccb9736be7b476 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <marc.zyngier(a)arm.com>
Date: Tue, 6 Mar 2018 21:44:37 +0000
Subject: kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3
From: Marc Zyngier <marc.zyngier(a)arm.com>
commit 27e91ad1e746e341ca2312f29bccb9736be7b476 upstream.
On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
force synchronization between the memory-mapped guest view and
the system-register view that the hypervisor uses.
This is incorrect, as the spec calls out the need for "a DSB whose
required access type is both loads and stores with any Shareability
attribute", while we're only synchronizing stores.
We also lack an isb after the dsb to ensure that the latter has
actually been executed before we start reading stuff from the sysregs.
The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
just after.
Cc: stable(a)vger.kernel.org
Fixes: f68d2b1b73cc ("arm64: KVM: Implement vgic-v3 save/restore")
Acked-by: Christoffer Dall <cdall(a)kernel.org>
Reviewed-by: Andre Przywara <andre.przywara(a)arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
virt/kvm/arm/hyp/vgic-v3-sr.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -215,7 +215,8 @@ void __hyp_text __vgic_v3_save_state(str
* are now visible to the system register interface.
*/
if (!cpu_if->vgic_sre) {
- dsb(st);
+ dsb(sy);
+ isb();
cpu_if->vgic_vmcr = read_gicreg(ICH_VMCR_EL2);
}
Patches currently in stable-queue which might be from marc.zyngier(a)arm.com are
queue-4.14/kvm-arm-arm64-vgic-don-t-populate-multiple-lrs-with-the-same-vintid.patch
queue-4.14/kvm-arm-arm64-vgic-v3-tighten-synchronization-for-guests-using-v2-on-v3.patch
queue-4.14/kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
This is a note to let you know that I've just added the patch titled
KVM: x86: Fix device passthrough when SME is active
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-x86-fix-device-passthrough-when-sme-is-active.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From daaf216c06fba4ee4dc3f62715667da929d68774 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky(a)amd.com>
Date: Thu, 8 Mar 2018 17:17:31 -0600
Subject: KVM: x86: Fix device passthrough when SME is active
From: Tom Lendacky <thomas.lendacky(a)amd.com>
commit daaf216c06fba4ee4dc3f62715667da929d68774 upstream.
When using device passthrough with SME active, the MMIO range that is
mapped for the device should not be mapped encrypted. Add a check in
set_spte() to insure that a page is not mapped encrypted if that page
is a device MMIO page as indicated by kvm_is_mmio_pfn().
Cc: <stable(a)vger.kernel.org> # 4.14.x-
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kvm/mmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2758,8 +2758,10 @@ static int set_spte(struct kvm_vcpu *vcp
else
pte_access &= ~ACC_WRITE_MASK;
+ if (!kvm_is_mmio_pfn(pfn))
+ spte |= shadow_me_mask;
+
spte |= (u64)pfn << PAGE_SHIFT;
- spte |= shadow_me_mask;
if (pte_access & ACC_WRITE_MASK) {
Patches currently in stable-queue which might be from thomas.lendacky(a)amd.com are
queue-4.14/kvm-x86-fix-device-passthrough-when-sme-is-active.patch
queue-4.14/x86-cpufeatures-add-intel-pconfig-cpufeature.patch
queue-4.14/x86-cpufeatures-add-intel-total-memory-encryption-cpufeature.patch
This is a note to let you know that I've just added the patch titled
KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-arm-arm64-vgic-don-t-populate-multiple-lrs-with-the-same-vintid.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 16ca6a607d84bef0129698d8d808f501afd08d43 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <marc.zyngier(a)arm.com>
Date: Tue, 6 Mar 2018 21:48:01 +0000
Subject: KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid
From: Marc Zyngier <marc.zyngier(a)arm.com>
commit 16ca6a607d84bef0129698d8d808f501afd08d43 upstream.
The vgic code is trying to be clever when injecting GICv2 SGIs,
and will happily populate LRs with the same interrupt number if
they come from multiple vcpus (after all, they are distinct
interrupt sources).
Unfortunately, this is against the letter of the architecture,
and the GICv2 architecture spec says "Each valid interrupt stored
in the List registers must have a unique VirtualID for that
virtual CPU interface.". GICv3 has similar (although slightly
ambiguous) restrictions.
This results in guests locking up when using GICv2-on-GICv3, for
example. The obvious fix is to stop trying so hard, and inject
a single vcpu per SGI per guest entry. After all, pending SGIs
with multiple source vcpus are pretty rare, and are mostly seen
in scenario where the physical CPUs are severely overcomitted.
But as we now only inject a single instance of a multi-source SGI per
vcpu entry, we may delay those interrupts for longer than strictly
necessary, and run the risk of injecting lower priority interrupts
in the meantime.
In order to address this, we adopt a three stage strategy:
- If we encounter a multi-source SGI in the AP list while computing
its depth, we force the list to be sorted
- When populating the LRs, we prevent the injection of any interrupt
of lower priority than that of the first multi-source SGI we've
injected.
- Finally, the injection of a multi-source SGI triggers the request
of a maintenance interrupt when there will be no pending interrupt
in the LRs (HCR_NPIE).
At the point where the last pending interrupt in the LRs switches
from Pending to Active, the maintenance interrupt will be delivered,
allowing us to add the remaining SGIs using the same process.
Cc: stable(a)vger.kernel.org
Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Acked-by: Christoffer Dall <cdall(a)kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/irqchip/arm-gic-v3.h | 1
include/linux/irqchip/arm-gic.h | 1
virt/kvm/arm/vgic/vgic-v2.c | 9 ++++-
virt/kvm/arm/vgic/vgic-v3.c | 9 ++++-
virt/kvm/arm/vgic/vgic.c | 61 ++++++++++++++++++++++++++++---------
virt/kvm/arm/vgic/vgic.h | 2 +
6 files changed, 67 insertions(+), 16 deletions(-)
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -501,6 +501,7 @@
#define ICH_HCR_EN (1 << 0)
#define ICH_HCR_UIE (1 << 1)
+#define ICH_HCR_NPIE (1 << 3)
#define ICH_HCR_TC (1 << 10)
#define ICH_HCR_TALL0 (1 << 11)
#define ICH_HCR_TALL1 (1 << 12)
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -84,6 +84,7 @@
#define GICH_HCR_EN (1 << 0)
#define GICH_HCR_UIE (1 << 1)
+#define GICH_HCR_NPIE (1 << 3)
#define GICH_LR_VIRTUALID (0x3ff << 0)
#define GICH_LR_PHYSID_CPUID_SHIFT (10)
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -37,6 +37,13 @@ void vgic_v2_init_lrs(void)
vgic_v2_write_lr(i, 0);
}
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu)
+{
+ struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
+
+ cpuif->vgic_hcr |= GICH_HCR_NPIE;
+}
+
void vgic_v2_set_underflow(struct kvm_vcpu *vcpu)
{
struct vgic_v2_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v2;
@@ -63,7 +70,7 @@ void vgic_v2_fold_lr_state(struct kvm_vc
struct vgic_v2_cpu_if *cpuif = &vgic_cpu->vgic_v2;
int lr;
- cpuif->vgic_hcr &= ~GICH_HCR_UIE;
+ cpuif->vgic_hcr &= ~(GICH_HCR_UIE | GICH_HCR_NPIE);
for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
u32 val = cpuif->vgic_lr[lr];
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -25,6 +25,13 @@ static bool group0_trap;
static bool group1_trap;
static bool common_trap;
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu)
+{
+ struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
+
+ cpuif->vgic_hcr |= ICH_HCR_NPIE;
+}
+
void vgic_v3_set_underflow(struct kvm_vcpu *vcpu)
{
struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
@@ -45,7 +52,7 @@ void vgic_v3_fold_lr_state(struct kvm_vc
u32 model = vcpu->kvm->arch.vgic.vgic_model;
int lr;
- cpuif->vgic_hcr &= ~ICH_HCR_UIE;
+ cpuif->vgic_hcr &= ~(ICH_HCR_UIE | ICH_HCR_NPIE);
for (lr = 0; lr < vgic_cpu->used_lrs; lr++) {
u64 val = cpuif->vgic_lr[lr];
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -610,22 +610,37 @@ static inline void vgic_set_underflow(st
vgic_v3_set_underflow(vcpu);
}
+static inline void vgic_set_npie(struct kvm_vcpu *vcpu)
+{
+ if (kvm_vgic_global_state.type == VGIC_V2)
+ vgic_v2_set_npie(vcpu);
+ else
+ vgic_v3_set_npie(vcpu);
+}
+
/* Requires the ap_list_lock to be held. */
-static int compute_ap_list_depth(struct kvm_vcpu *vcpu)
+static int compute_ap_list_depth(struct kvm_vcpu *vcpu,
+ bool *multi_sgi)
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_irq *irq;
int count = 0;
+ *multi_sgi = false;
+
DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&vgic_cpu->ap_list_lock));
list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
spin_lock(&irq->irq_lock);
/* GICv2 SGIs can count for more than one... */
- if (vgic_irq_is_sgi(irq->intid) && irq->source)
- count += hweight8(irq->source);
- else
+ if (vgic_irq_is_sgi(irq->intid) && irq->source) {
+ int w = hweight8(irq->source);
+
+ count += w;
+ *multi_sgi |= (w > 1);
+ } else {
count++;
+ }
spin_unlock(&irq->irq_lock);
}
return count;
@@ -636,28 +651,43 @@ static void vgic_flush_lr_state(struct k
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_irq *irq;
- int count = 0;
+ int count;
+ bool npie = false;
+ bool multi_sgi;
+ u8 prio = 0xff;
DEBUG_SPINLOCK_BUG_ON(!spin_is_locked(&vgic_cpu->ap_list_lock));
- if (compute_ap_list_depth(vcpu) > kvm_vgic_global_state.nr_lr)
+ count = compute_ap_list_depth(vcpu, &multi_sgi);
+ if (count > kvm_vgic_global_state.nr_lr || multi_sgi)
vgic_sort_ap_list(vcpu);
+ count = 0;
+
list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
spin_lock(&irq->irq_lock);
- if (unlikely(vgic_target_oracle(irq) != vcpu))
- goto next;
-
/*
- * If we get an SGI with multiple sources, try to get
- * them in all at once.
+ * If we have multi-SGIs in the pipeline, we need to
+ * guarantee that they are all seen before any IRQ of
+ * lower priority. In that case, we need to filter out
+ * these interrupts by exiting early. This is easy as
+ * the AP list has been sorted already.
*/
- do {
+ if (multi_sgi && irq->priority > prio) {
+ spin_unlock(&irq->irq_lock);
+ break;
+ }
+
+ if (likely(vgic_target_oracle(irq) == vcpu)) {
vgic_populate_lr(vcpu, irq, count++);
- } while (irq->source && count < kvm_vgic_global_state.nr_lr);
-next:
+ if (irq->source) {
+ npie = true;
+ prio = irq->priority;
+ }
+ }
+
spin_unlock(&irq->irq_lock);
if (count == kvm_vgic_global_state.nr_lr) {
@@ -668,6 +698,9 @@ next:
}
}
+ if (npie)
+ vgic_set_npie(vcpu);
+
vcpu->arch.vgic_cpu.used_lrs = count;
/* Nuke remaining LRs */
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -150,6 +150,7 @@ void vgic_v2_fold_lr_state(struct kvm_vc
void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
void vgic_v2_clear_lr(struct kvm_vcpu *vcpu, int lr);
void vgic_v2_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v2_set_npie(struct kvm_vcpu *vcpu);
int vgic_v2_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
int vgic_v2_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
int offset, u32 *val);
@@ -179,6 +180,7 @@ void vgic_v3_fold_lr_state(struct kvm_vc
void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
void vgic_v3_clear_lr(struct kvm_vcpu *vcpu, int lr);
void vgic_v3_set_underflow(struct kvm_vcpu *vcpu);
+void vgic_v3_set_npie(struct kvm_vcpu *vcpu);
void vgic_v3_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
void vgic_v3_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr);
void vgic_v3_enable(struct kvm_vcpu *vcpu);
Patches currently in stable-queue which might be from marc.zyngier(a)arm.com are
queue-4.14/kvm-arm-arm64-vgic-don-t-populate-multiple-lrs-with-the-same-vintid.patch
queue-4.14/kvm-arm-arm64-vgic-v3-tighten-synchronization-for-guests-using-v2-on-v3.patch
queue-4.14/kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
This is a note to let you know that I've just added the patch titled
KVM: arm/arm64: Reduce verbosity of KVM init log
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 76600428c3677659e3c3633bb4f2ea302220a275 Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Date: Fri, 2 Mar 2018 08:16:30 +0000
Subject: KVM: arm/arm64: Reduce verbosity of KVM init log
From: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
commit 76600428c3677659e3c3633bb4f2ea302220a275 upstream.
On my GICv3 system, the following is printed to the kernel log at boot:
kvm [1]: 8-bit VMID
kvm [1]: IDMAP page: d20e35000
kvm [1]: HYP VA range: 800000000000:ffffffffffff
kvm [1]: vgic-v2@2c020000
kvm [1]: GIC system register CPU interface enabled
kvm [1]: vgic interrupt IRQ1
kvm [1]: virtual timer IRQ4
kvm [1]: Hyp mode initialized successfully
The KVM IDMAP is a mapping of a statically allocated kernel structure,
and so printing its physical address leaks the physical placement of
the kernel when physical KASLR in effect. So change the kvm_info() to
kvm_debug() to remove it from the log output.
While at it, trim the output a bit more: IRQ numbers can be found in
/proc/interrupts, and the HYP VA and vgic-v2 lines are not highly
informational either.
Cc: <stable(a)vger.kernel.org>
Acked-by: Will Deacon <will.deacon(a)arm.com>
Acked-by: Christoffer Dall <cdall(a)kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel(a)linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
virt/kvm/arm/arch_timer.c | 2 +-
virt/kvm/arm/mmu.c | 6 +++---
virt/kvm/arm/vgic/vgic-v2.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -602,7 +602,7 @@ int kvm_timer_hyp_init(void)
return err;
}
- kvm_info("virtual timer IRQ%d\n", host_vtimer_irq);
+ kvm_debug("virtual timer IRQ%d\n", host_vtimer_irq);
cpuhp_setup_state(CPUHP_AP_KVM_ARM_TIMER_STARTING,
"kvm/arm/timer:starting", kvm_timer_starting_cpu,
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1760,9 +1760,9 @@ int kvm_mmu_init(void)
*/
BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
- kvm_info("IDMAP page: %lx\n", hyp_idmap_start);
- kvm_info("HYP VA range: %lx:%lx\n",
- kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
+ kvm_debug("IDMAP page: %lx\n", hyp_idmap_start);
+ kvm_debug("HYP VA range: %lx:%lx\n",
+ kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
hyp_idmap_start < kern_hyp_va(~0UL) &&
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -380,7 +380,7 @@ int vgic_v2_probe(const struct gic_kvm_i
kvm_vgic_global_state.type = VGIC_V2;
kvm_vgic_global_state.max_gic_vcpus = VGIC_V2_MAX_CPUS;
- kvm_info("vgic-v2@%llx\n", info->vctrl.start);
+ kvm_debug("vgic-v2@%llx\n", info->vctrl.start);
return 0;
out:
Patches currently in stable-queue which might be from ard.biesheuvel(a)linaro.org are
queue-4.14/kvm-arm-arm64-reduce-verbosity-of-kvm-init-log.patch
This is a note to let you know that I've just added the patch titled
fs: Teach path_connected to handle nfs filesystems with multiple roots.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fs-teach-path_connected-to-handle-nfs-filesystems-with-multiple-roots.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95dd77580ccd66a0da96e6d4696945b8cea39431 Mon Sep 17 00:00:00 2001
From: "Eric W. Biederman" <ebiederm(a)xmission.com>
Date: Wed, 14 Mar 2018 18:20:29 -0500
Subject: fs: Teach path_connected to handle nfs filesystems with multiple roots.
From: Eric W. Biederman <ebiederm(a)xmission.com>
commit 95dd77580ccd66a0da96e6d4696945b8cea39431 upstream.
On nfsv2 and nfsv3 the nfs server can export subsets of the same
filesystem and report the same filesystem identifier, so that the nfs
client can know they are the same filesystem. The subsets can be from
disjoint directory trees. The nfsv2 and nfsv3 filesystems provides no
way to find the common root of all directory trees exported form the
server with the same filesystem identifier.
The practical result is that in struct super s_root for nfs s_root is
not necessarily the root of the filesystem. The nfs mount code sets
s_root to the root of the first subset of the nfs filesystem that the
kernel mounts.
This effects the dcache invalidation code in generic_shutdown_super
currently called shrunk_dcache_for_umount and that code for years
has gone through an additional list of dentries that might be dentry
trees that need to be freed to accomodate nfs.
When I wrote path_connected I did not realize nfs was so special, and
it's hueristic for avoiding calling is_subdir can fail.
The practical case where this fails is when there is a move of a
directory from the subtree exposed by one nfs mount to the subtree
exposed by another nfs mount. This move can happen either locally or
remotely. With the remote case requiring that the move directory be cached
before the move and that after the move someone walks the path
to where the move directory now exists and in so doing causes the
already cached directory to be moved in the dcache through the magic
of d_splice_alias.
If someone whose working directory is in the move directory or a
subdirectory and now starts calling .. from the initial mount of nfs
(where s_root == mnt_root), then path_connected as a heuristic will
not bother with the is_subdir check. As s_root really is not the root
of the nfs filesystem this heuristic is wrong, and the path may
actually not be connected and path_connected can fail.
The is_subdir function might be cheap enough that we can call it
unconditionally. Verifying that will take some benchmarking and
the result may not be the same on all kernels this fix needs
to be backported to. So I am avoiding that for now.
Filesystems with snapshots such as nilfs and btrfs do something
similar. But as the directory tree of the snapshots are disjoint
from one another and from the main directory tree rename won't move
things between them and this problem will not occur.
Cc: stable(a)vger.kernel.org
Reported-by: Al Viro <viro(a)ZenIV.linux.org.uk>
Fixes: 397d425dc26d ("vfs: Test for and handle paths that are unreachable from their mnt_root")
Signed-off-by: "Eric W. Biederman" <ebiederm(a)xmission.com>
Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/namei.c | 5 +++--
fs/nfs/super.c | 2 ++
include/linux/fs.h | 1 +
3 files changed, 6 insertions(+), 2 deletions(-)
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -578,9 +578,10 @@ static int __nd_alloc_stack(struct namei
static bool path_connected(const struct path *path)
{
struct vfsmount *mnt = path->mnt;
+ struct super_block *sb = mnt->mnt_sb;
- /* Only bind mounts can have disconnected paths */
- if (mnt->mnt_root == mnt->mnt_sb->s_root)
+ /* Bind mounts and multi-root filesystems can have disconnected paths */
+ if (!(sb->s_iflags & SB_I_MULTIROOT) && (mnt->mnt_root == sb->s_root))
return true;
return is_subdir(path->dentry, mnt->mnt_root);
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -2623,6 +2623,8 @@ struct dentry *nfs_fs_mount_common(struc
/* initial superblock/root creation */
mount_info->fill_super(s, mount_info);
nfs_get_cache_cookie(s, mount_info->parsed, mount_info->cloned);
+ if (!(server->flags & NFS_MOUNT_UNSHARED))
+ s->s_iflags |= SB_I_MULTIROOT;
}
mntroot = nfs_get_root(s, mount_info->mntfh, dev_name);
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1312,6 +1312,7 @@ extern int send_sigurg(struct fown_struc
#define SB_I_CGROUPWB 0x00000001 /* cgroup-aware writeback enabled */
#define SB_I_NOEXEC 0x00000002 /* Ignore executables on this fs */
#define SB_I_NODEV 0x00000004 /* Ignore devices on this fs */
+#define SB_I_MULTIROOT 0x00000008 /* Multiple roots to the dentry tree */
/* sb->s_iflags to limit user namespace mounts */
#define SB_I_USERNS_VISIBLE 0x00000010 /* fstype already mounted */
Patches currently in stable-queue which might be from ebiederm(a)xmission.com are
queue-4.14/fs-teach-path_connected-to-handle-nfs-filesystems-with-multiple-roots.patch
This is a note to let you know that I've just added the patch titled
fs/aio: Use RCU accessors for kioctx_table->table[]
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fs-aio-use-rcu-accessors-for-kioctx_table-table.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d0264c01e7587001a8c4608a5d1818dba9a4c11a Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj(a)kernel.org>
Date: Wed, 14 Mar 2018 12:10:17 -0700
Subject: fs/aio: Use RCU accessors for kioctx_table->table[]
From: Tejun Heo <tj(a)kernel.org>
commit d0264c01e7587001a8c4608a5d1818dba9a4c11a upstream.
While converting ioctx index from a list to a table, db446a08c23d
("aio: convert the ioctx list to table lookup v3") missed tagging
kioctx_table->table[] as an array of RCU pointers and using the
appropriate RCU accessors. This introduces a small window in the
lookup path where init and access may race.
Mark kioctx_table->table[] with __rcu and use the approriate RCU
accessors when using the field.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: db446a08c23d ("aio: convert the ioctx list to table lookup v3")
Cc: Benjamin LaHaise <bcrl(a)kvack.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org # v3.12+
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/aio.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -68,9 +68,9 @@ struct aio_ring {
#define AIO_RING_PAGES 8
struct kioctx_table {
- struct rcu_head rcu;
- unsigned nr;
- struct kioctx *table[];
+ struct rcu_head rcu;
+ unsigned nr;
+ struct kioctx __rcu *table[];
};
struct kioctx_cpu {
@@ -330,7 +330,7 @@ static int aio_ring_mremap(struct vm_are
for (i = 0; i < table->nr; i++) {
struct kioctx *ctx;
- ctx = table->table[i];
+ ctx = rcu_dereference(table->table[i]);
if (ctx && ctx->aio_ring_file == file) {
if (!atomic_read(&ctx->dead)) {
ctx->user_id = ctx->mmap_base = vma->vm_start;
@@ -666,9 +666,9 @@ static int ioctx_add_table(struct kioctx
while (1) {
if (table)
for (i = 0; i < table->nr; i++)
- if (!table->table[i]) {
+ if (!rcu_access_pointer(table->table[i])) {
ctx->id = i;
- table->table[i] = ctx;
+ rcu_assign_pointer(table->table[i], ctx);
spin_unlock(&mm->ioctx_lock);
/* While kioctx setup is in progress,
@@ -849,8 +849,8 @@ static int kill_ioctx(struct mm_struct *
}
table = rcu_dereference_raw(mm->ioctx_table);
- WARN_ON(ctx != table->table[ctx->id]);
- table->table[ctx->id] = NULL;
+ WARN_ON(ctx != rcu_access_pointer(table->table[ctx->id]));
+ RCU_INIT_POINTER(table->table[ctx->id], NULL);
spin_unlock(&mm->ioctx_lock);
/* free_ioctx_reqs() will do the necessary RCU synchronization */
@@ -895,7 +895,8 @@ void exit_aio(struct mm_struct *mm)
skipped = 0;
for (i = 0; i < table->nr; ++i) {
- struct kioctx *ctx = table->table[i];
+ struct kioctx *ctx =
+ rcu_dereference_protected(table->table[i], true);
if (!ctx) {
skipped++;
@@ -1084,7 +1085,7 @@ static struct kioctx *lookup_ioctx(unsig
if (!table || id >= table->nr)
goto out;
- ctx = table->table[id];
+ ctx = rcu_dereference(table->table[id]);
if (ctx && ctx->user_id == ctx_id) {
percpu_ref_get(&ctx->users);
ret = ctx;
Patches currently in stable-queue which might be from tj(a)kernel.org are
queue-4.14/rdmavt-fix-synchronization-around-percpu_ref.patch
queue-4.14/fs-aio-add-explicit-rcu-grace-period-when-freeing-kioctx.patch
queue-4.14/fs-aio-use-rcu-accessors-for-kioctx_table-table.patch
This is a note to let you know that I've just added the patch titled
fs/aio: Add explicit RCU grace period when freeing kioctx
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fs-aio-add-explicit-rcu-grace-period-when-freeing-kioctx.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a6d7cff472eea87d96899a20fa718d2bab7109f3 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj(a)kernel.org>
Date: Wed, 14 Mar 2018 12:10:17 -0700
Subject: fs/aio: Add explicit RCU grace period when freeing kioctx
From: Tejun Heo <tj(a)kernel.org>
commit a6d7cff472eea87d96899a20fa718d2bab7109f3 upstream.
While fixing refcounting, e34ecee2ae79 ("aio: Fix a trinity splat")
incorrectly removed explicit RCU grace period before freeing kioctx.
The intention seems to be depending on the internal RCU grace periods
of percpu_ref; however, percpu_ref uses a different flavor of RCU,
sched-RCU. This can lead to kioctx being freed while RCU read
protected dereferences are still in progress.
Fix it by updating free_ioctx() to go through call_rcu() explicitly.
v2: Comment added to explain double bouncing.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: e34ecee2ae79 ("aio: Fix a trinity splat")
Cc: Kent Overstreet <kent.overstreet(a)gmail.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: stable(a)vger.kernel.org # v3.13+
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/aio.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -115,7 +115,8 @@ struct kioctx {
struct page **ring_pages;
long nr_pages;
- struct work_struct free_work;
+ struct rcu_head free_rcu;
+ struct work_struct free_work; /* see free_ioctx() */
/*
* signals when all in-flight requests are done
@@ -588,6 +589,12 @@ static int kiocb_cancel(struct aio_kiocb
return cancel(&kiocb->common);
}
+/*
+ * free_ioctx() should be RCU delayed to synchronize against the RCU
+ * protected lookup_ioctx() and also needs process context to call
+ * aio_free_ring(), so the double bouncing through kioctx->free_rcu and
+ * ->free_work.
+ */
static void free_ioctx(struct work_struct *work)
{
struct kioctx *ctx = container_of(work, struct kioctx, free_work);
@@ -601,6 +608,14 @@ static void free_ioctx(struct work_struc
kmem_cache_free(kioctx_cachep, ctx);
}
+static void free_ioctx_rcufn(struct rcu_head *head)
+{
+ struct kioctx *ctx = container_of(head, struct kioctx, free_rcu);
+
+ INIT_WORK(&ctx->free_work, free_ioctx);
+ schedule_work(&ctx->free_work);
+}
+
static void free_ioctx_reqs(struct percpu_ref *ref)
{
struct kioctx *ctx = container_of(ref, struct kioctx, reqs);
@@ -609,8 +624,8 @@ static void free_ioctx_reqs(struct percp
if (ctx->rq_wait && atomic_dec_and_test(&ctx->rq_wait->count))
complete(&ctx->rq_wait->comp);
- INIT_WORK(&ctx->free_work, free_ioctx);
- schedule_work(&ctx->free_work);
+ /* Synchronize against RCU protected table->table[] dereferences */
+ call_rcu(&ctx->free_rcu, free_ioctx_rcufn);
}
/*
@@ -838,7 +853,7 @@ static int kill_ioctx(struct mm_struct *
table->table[ctx->id] = NULL;
spin_unlock(&mm->ioctx_lock);
- /* percpu_ref_kill() will do the necessary call_rcu() */
+ /* free_ioctx_reqs() will do the necessary RCU synchronization */
wake_up_all(&ctx->wait);
/*
Patches currently in stable-queue which might be from tj(a)kernel.org are
queue-4.14/rdmavt-fix-synchronization-around-percpu_ref.patch
queue-4.14/fs-aio-add-explicit-rcu-grace-period-when-freeing-kioctx.patch
queue-4.14/fs-aio-use-rcu-accessors-for-kioctx_table-table.patch
This is a note to let you know that I've just added the patch titled
drm/radeon: fix prime teardown order
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-radeon-fix-prime-teardown-order.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0f4f715bc6bed3bf14c5cd7d5fe88d443e756b14 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig(a)amd.com>
Date: Fri, 9 Mar 2018 14:44:32 +0100
Subject: drm/radeon: fix prime teardown order
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
From: Christian König <christian.koenig(a)amd.com>
commit 0f4f715bc6bed3bf14c5cd7d5fe88d443e756b14 upstream.
We unmapped imported DMA-bufs when the GEM handle was dropped, not when the
hardware was done with the buffere.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer(a)amd.com>
CC: stable(a)vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/radeon/radeon_gem.c | 2 --
drivers/gpu/drm/radeon/radeon_object.c | 2 ++
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -34,8 +34,6 @@ void radeon_gem_object_free(struct drm_g
struct radeon_bo *robj = gem_to_radeon_bo(gobj);
if (robj) {
- if (robj->gem_base.import_attach)
- drm_prime_gem_destroy(&robj->gem_base, robj->tbo.sg);
radeon_mn_unregister(robj);
radeon_bo_unref(&robj);
}
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -82,6 +82,8 @@ static void radeon_ttm_bo_destroy(struct
mutex_unlock(&bo->rdev->gem.mutex);
radeon_bo_clear_surface_reg(bo);
WARN_ON_ONCE(!list_empty(&bo->va));
+ if (bo->gem_base.import_attach)
+ drm_prime_gem_destroy(&bo->gem_base, bo->tbo.sg);
drm_gem_object_release(&bo->gem_base);
kfree(bo);
}
Patches currently in stable-queue which might be from christian.koenig(a)amd.com are
queue-4.14/drm-amdgpu-fix-prime-teardown-order.patch
queue-4.14/drm-radeon-fix-prime-teardown-order.patch