As promised on the list [0], this series aims to backport 3 upstream
commits [1,2,3] into the 5.12-stable tree.
Patch #1 is already in the queue and is therefore not included. Patch #2
applies once the __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc macro has been added
by hand (please review). Patch #3 then applies cleanly on top of #2.
I've lightly tested it on my 920 (boot test and the whole kvm-unit-tests),
on top of the latest linux-stable-rc/linux-5.12.y. Please consider taking
them for 5.12-stable.
* From v1:
- Allocate a new number for __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc
- Collect Marc's R-b tags
[0] https://lore.kernel.org/r/0d9f123c-e9f7-7481-143d-efd488873082@huawei.com
[1] https://git.kernel.org/torvalds/c/f5e30680616a
[2] https://git.kernel.org/torvalds/c/26778aaa134a
[3] https://git.kernel.org/torvalds/c/e3e880bb1518
Marc Zyngier (1):
KVM: arm64: Commit pending PC adjustemnts before returning to
userspace
Zenghui Yu (1):
KVM: arm64: Resolve all pending PC updates before immediate exit
arch/arm64/include/asm/kvm_asm.h | 1 +
arch/arm64/kvm/arm.c | 20 +++++++++++++++++---
arch/arm64/kvm/hyp/exception.c | 4 ++--
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 8 ++++++++
4 files changed, 28 insertions(+), 5 deletions(-)
--
2.19.1
The patch below does not apply to the 5.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0884335a2e653b8a045083aa1d57ce74269ac81d Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Wed, 21 Apr 2021 19:21:22 -0700
Subject: [PATCH] KVM: SVM: Truncate GPR value for DR and CR accesses in
!64-bit mode
Drop bits 63:32 on loads/stores to/from DRs and CRs when the vCPU is not
in 64-bit mode. The APM states bits 63:32 are dropped for both DRs and
CRs:
In 64-bit mode, the operand size is fixed at 64 bits without the need
for a REX prefix. In non-64-bit mode, the operand size is fixed at 32
bits and the upper 32 bits of the destination are forced to 0.
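A minimal userspace sketch of the truncation rule quoted above. The helper
names gpr_readl() and gpr_writel() are hypothetical stand-ins for this
illustration only; they are not the KVM helpers and take the CPU mode as an
explicit argument instead of a vCPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a mode-aware register read: keep all 64 bits
 * in 64-bit mode, otherwise keep only bits 31:0.
 */
static uint64_t gpr_readl(uint64_t raw, bool is_64bit_mode)
{
    return is_64bit_mode ? raw : (uint32_t)raw;
}

/*
 * Hypothetical stand-in for a mode-aware register write: outside 64-bit
 * mode the upper 32 bits of the destination are forced to 0.
 */
static void gpr_writel(uint64_t *gpr, uint64_t val, bool is_64bit_mode)
{
    *gpr = is_64bit_mode ? val : (uint32_t)val;
}

/* Small usage example with stale upper bits in the source register. */
static void gpr_example(void)
{
    uint64_t reg = ~0ULL;

    assert(gpr_readl(0xdeadbeef00000001ULL, false) == 0x1ULL);
    assert(gpr_readl(0xdeadbeef00000001ULL, true) == 0xdeadbeef00000001ULL);

    gpr_writel(&reg, 0x1234567890abcdefULL, false);
    assert(reg == 0x90abcdefULL);
}
```

Without the truncation, stale bits 63:32 left in a GPR by earlier 64-bit
code would leak into the DR or CR value seen by the rest of the handler.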
Fixes: 7ff76d58a9dc ("KVM: SVM: enhance MOV CR intercept handler")
Fixes: cae3797a4639 ("KVM: SVM: enhance mov DR intercept handler")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20210422022128.3464144-4-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 301792542937..857bcf3a4cda 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2451,7 +2451,7 @@ static int cr_interception(struct kvm_vcpu *vcpu)
err = 0;
if (cr >= 16) { /* mov to cr */
cr -= 16;
- val = kvm_register_read(vcpu, reg);
+ val = kvm_register_readl(vcpu, reg);
trace_kvm_cr_write(cr, val);
switch (cr) {
case 0:
@@ -2497,7 +2497,7 @@ static int cr_interception(struct kvm_vcpu *vcpu)
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
- kvm_register_write(vcpu, reg, val);
+ kvm_register_writel(vcpu, reg, val);
trace_kvm_cr_read(cr, val);
}
return kvm_complete_insn_gp(vcpu, err);
@@ -2563,11 +2563,11 @@ static int dr_interception(struct kvm_vcpu *vcpu)
dr = svm->vmcb->control.exit_code - SVM_EXIT_READ_DR0;
if (dr >= 16) { /* mov to DRn */
dr -= 16;
- val = kvm_register_read(vcpu, reg);
+ val = kvm_register_readl(vcpu, reg);
err = kvm_set_dr(vcpu, dr, val);
} else {
kvm_get_dr(vcpu, dr, &val);
- kvm_register_write(vcpu, reg, val);
+ kvm_register_writel(vcpu, reg, val);
}
return kvm_complete_insn_gp(vcpu, err);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5e753a817b2d5991dfe8a801b7b1e8e79a1c5a20 Mon Sep 17 00:00:00 2001
From: Anand Jain <anand.jain(a)oracle.com>
Date: Fri, 30 Apr 2021 19:59:51 +0800
Subject: [PATCH] btrfs: fix unmountable seed device after fstrim
The following test case reproduces an issue of wrongly freeing in-use
blocks on the readonly seed device when fstrim is called on the rw sprout
device, as shown below.
Create a seed device and add a sprout device to it:
$ mkfs.btrfs -fq -dsingle -msingle /dev/loop0
$ btrfstune -S 1 /dev/loop0
$ mount /dev/loop0 /btrfs
$ btrfs dev add -f /dev/loop1 /btrfs
BTRFS info (device loop0): relocating block group 290455552 flags system
BTRFS info (device loop0): relocating block group 1048576 flags system
BTRFS info (device loop0): disk added /dev/loop1
$ umount /btrfs
Mount the sprout device and run fstrim:
$ mount /dev/loop1 /btrfs
$ fstrim /btrfs
$ umount /btrfs
Now try to mount the seed device, and it fails:
$ mount /dev/loop0 /btrfs
mount: /btrfs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
Block 5292032 is missing on the readonly seed device:
$ dmesg -kt | tail
<snip>
BTRFS error (device loop0): bad tree block start, want 5292032 have 0
BTRFS warning (device loop0): couldn't read-tree root
BTRFS error (device loop0): open_ctree failed
From the dump-tree of the seed device (taken before the fstrim), block
5292032 belonged to the block group starting at 5242880:
$ btrfs inspect dump-tree -e /dev/loop0 | grep -A1 BLOCK_GROUP
<snip>
item 3 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16169 itemsize 24
block group used 114688 chunk_objectid 256 flags METADATA
<snip>
From the dump-tree of the sprout device (taken before the fstrim),
fstrim used block-group 5242880 to find the related free space to free:
$ btrfs inspect dump-tree -e /dev/loop1 | grep -A1 BLOCK_GROUP
<snip>
item 1 key (5242880 BLOCK_GROUP_ITEM 8388608) itemoff 16226 itemsize 24
block group used 32768 chunk_objectid 256 flags METADATA
<snip>
BPF kernel tracing the fstrim command finds the missing block 5292032
within the range of the discarded blocks as below:
kprobe:btrfs_discard_extent {
printf("freeing start %llu end %llu num_bytes %llu:\n",
arg1, arg1+arg2, arg2);
}
freeing start 5259264 end 5406720 num_bytes 147456
<snip>
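The numbers above can be cross-checked with a few lines of C. This is only
an arithmetic sketch using the values printed in this message;
range_contains() is a helper made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr lies in the half-open range [start, start + len). */
static bool range_contains(uint64_t start, uint64_t len, uint64_t addr)
{
    return addr >= start && addr - start < len;
}

static void check_reported_numbers(void)
{
    const uint64_t missing = 5292032;  /* block the seed mount wants */

    /* Metadata block group shown by both dump-tree outputs. */
    assert(range_contains(5242880, 8388608, missing));

    /* Range printed by the kprobe: start 5259264, num_bytes 147456. */
    assert(5259264 + 147456 == 5406720);
    assert(range_contains(5259264, 147456, missing));
}
```

So the block the seed device can no longer read sits 32768 bytes into the
range that fstrim discarded.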
Fix this by not sending the discard command to the readonly seed device.
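The idea of the fix, sketched in userspace C. struct demo_device and
demo_discard() are hypothetical simplifications for this illustration and
do not mirror the btrfs structures; the only point is that a device which
is present but not writeable is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the device behind a stripe. */
struct demo_device {
    bool present;              /* false models a missing device */
    bool writeable;            /* false models the readonly seed */
    uint64_t discarded_bytes;  /* bytes this device was asked to discard */
};

/*
 * Discard `bytes` on every present, writeable device and return the
 * total number of bytes discarded.
 */
static uint64_t demo_discard(struct demo_device *devs, size_t n, uint64_t bytes)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (!devs[i].present)
            continue;
        if (!devs[i].writeable)  /* never trim the seed device */
            continue;
        devs[i].discarded_bytes += bytes;
        total += bytes;
    }
    return total;
}

static void demo_discard_example(void)
{
    struct demo_device devs[2] = {
        { .present = true, .writeable = false },  /* seed */
        { .present = true, .writeable = true },   /* sprout */
    };

    assert(demo_discard(devs, 2, 147456) == 147456);
    assert(devs[0].discarded_bytes == 0);
    assert(devs[1].discarded_bytes == 147456);
}
```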
Reported-by: Chris Murphy <lists(a)colorremedies.com>
CC: stable(a)vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: Anand Jain <anand.jain(a)oracle.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7a28314189b4..f1d15b68994a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -1340,12 +1340,16 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr,
stripe = bbio->stripes;
for (i = 0; i < bbio->num_stripes; i++, stripe++) {
u64 bytes;
+ struct btrfs_device *device = stripe->dev;
- if (!stripe->dev->bdev) {
+ if (!device->bdev) {
ASSERT(btrfs_test_opt(fs_info, DEGRADED));
continue;
}
+ if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
+ continue;
+
ret = do_discard_extent(stripe, &bytes);
if (!ret) {
discarded_bytes += bytes;
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 8d651ee9c71bb12fc0c8eb2786b66cbe5aa3e43b
Gitweb: https://git.kernel.org/tip/8d651ee9c71bb12fc0c8eb2786b66cbe5aa3e43b
Author: Tom Lendacky <thomas.lendacky(a)amd.com>
AuthorDate: Tue, 08 Jun 2021 11:54:33 +02:00
Committer: Borislav Petkov <bp(a)suse.de>
CommitterDate: Tue, 08 Jun 2021 16:26:55 +02:00
x86/ioremap: Map EFI-reserved memory as encrypted for SEV
Some drivers require memory that is marked as EFI boot services
data. In order for this memory to not be re-used by the kernel
after ExitBootServices(), efi_mem_reserve() is used to preserve it
by inserting a new EFI memory descriptor and marking it with the
EFI_MEMORY_RUNTIME attribute.
Under SEV, memory marked with the EFI_MEMORY_RUNTIME attribute needs to
be mapped encrypted by Linux, otherwise the kernel might crash at boot
like below:
EFI Variables Facility v0.08 2004-May-17
general protection fault, probably for non-canonical address 0x3597688770a868b2: 0000 [#1] SMP NOPTI
CPU: 13 PID: 1 Comm: swapper/0 Not tainted 5.12.4-2-default #1 openSUSE Tumbleweed
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:efi_mokvar_entry_next
[...]
Call Trace:
efi_mokvar_sysfs_init
? efi_mokvar_table_init
do_one_initcall
? __kmalloc
kernel_init_freeable
? rest_init
kernel_init
ret_from_fork
Expand the __ioremap_check_other() function to additionally check for
this other type of boot data reserved at runtime and indicate that it
should be mapped encrypted for an SEV guest.
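A standalone sketch of the extended check, for illustration only. The
DEMO_* constants are local names; their values are believed to match the
UEFI memory type and attribute definitions, but the logic does not depend
on them. demo_map_encrypted() is a hypothetical helper, not the kernel
function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the relevant UEFI memory types (assumed values). */
enum {
    DEMO_EFI_BOOT_SERVICES_DATA    = 4,
    DEMO_EFI_RUNTIME_SERVICES_DATA = 6,
};

/* Local copy of the runtime attribute bit (assumed value). */
#define DEMO_EFI_MEMORY_RUNTIME 0x8000000000000000ULL

/*
 * Decide whether a region should be mapped encrypted: runtime services
 * data always, boot services data only when it was preserved by being
 * marked with the runtime attribute.
 */
static bool demo_map_encrypted(int type, uint64_t attr)
{
    if (type == DEMO_EFI_RUNTIME_SERVICES_DATA)
        return true;
    return type == DEMO_EFI_BOOT_SERVICES_DATA &&
           (attr & DEMO_EFI_MEMORY_RUNTIME) != 0;
}

static void demo_map_encrypted_example(void)
{
    /* Reserved boot services data: the case this patch adds. */
    assert(demo_map_encrypted(DEMO_EFI_BOOT_SERVICES_DATA,
                              DEMO_EFI_MEMORY_RUNTIME));
    /* Ordinary boot services data is left alone. */
    assert(!demo_map_encrypted(DEMO_EFI_BOOT_SERVICES_DATA, 0));
    /* Runtime services data was already handled before the patch. */
    assert(demo_map_encrypted(DEMO_EFI_RUNTIME_SERVICES_DATA, 0));
}
```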
[ bp: Massage commit message. ]
Fixes: 58c909022a5a ("efi: Support for MOK variable config table")
Reported-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Tom Lendacky <thomas.lendacky(a)amd.com>
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Tested-by: Joerg Roedel <jroedel(a)suse.de>
Cc: <stable(a)vger.kernel.org> # 5.10+
Link: https://lkml.kernel.org/r/20210608095439.12668-2-joro@8bytes.org
---
arch/x86/mm/ioremap.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 12c686c..60ade7d 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -118,7 +118,9 @@ static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *des
if (!IS_ENABLED(CONFIG_EFI))
return;
- if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA)
+ if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA ||
+ (efi_mem_type(addr) == EFI_BOOT_SERVICES_DATA &&
+ efi_mem_attributes(addr) & EFI_MEMORY_RUNTIME))
desc->flags |= IORES_MAP_ENCRYPTED;
}
We don't set the SB_BORN flag on submounts. This is wrong: the rest of
the code then treats these superblocks as partially constructed or
dying, which can break some assumptions.
One such case is when you have a virtiofs filesystem with submounts
and you try to mount it again: virtio_fs_get_tree() tries to obtain
a superblock with sget_fc(). The logic in sget_fc() is to loop until
it has either found an existing matching superblock with SB_BORN set
or created a brand new one. It is assumed that a superblock without
SB_BORN is transient and the loop is restarted. Forgetting to set
SB_BORN on submounts hence causes sget_fc() to retry forever.
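The retry behaviour can be modelled in a few lines of userspace C. This is
a deliberately simplified sketch of the loop as described above, not the
sget_fc() code: struct demo_sb and demo_lookup() are hypothetical, and the
max_passes bound exists only so that the example terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of an existing superblock. */
struct demo_sb {
    bool matches;  /* would the lookup consider it the same filesystem? */
    bool born;     /* is SB_BORN set? */
};

enum demo_result { DEMO_REUSED, DEMO_CREATED, DEMO_STUCK };

/*
 * Model of the retry loop. The real loop is unbounded; DEMO_STUCK stands
 * for "would still be retrying".
 */
static enum demo_result demo_lookup(const struct demo_sb *existing,
                                    int max_passes)
{
    for (int pass = 0; pass < max_passes; pass++) {
        if (!existing || !existing->matches)
            return DEMO_CREATED;  /* nothing to reuse: make a new one */
        if (existing->born)
            return DEMO_REUSED;   /* fully constructed: reuse it */
        /* Not born: assumed transient, so start over. */
    }
    return DEMO_STUCK;
}

static void demo_lookup_example(void)
{
    struct demo_sb fixed = { .matches = true, .born = true };
    struct demo_sb buggy = { .matches = true, .born = false };

    assert(demo_lookup(&fixed, 1000) == DEMO_REUSED);
    assert(demo_lookup(&buggy, 1000) == DEMO_STUCK);
}
```

A submount superblock that never gets SB_BORN corresponds to the `buggy`
case: it keeps matching, is never considered born, and nothing ever
changes that.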
Setting SB_BORN requires special care, i.e. a write barrier for
super_cache_count() which can check SB_BORN without taking any lock.
We should call vfs_get_tree() to deal with that, but this requires
a proper ->get_tree() implementation for submounts, which
is a bigger piece of work. Go for a simple bug fix in the meantime.
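For readers less familiar with the ordering requirement, here is a
userspace analogue using C11 atomics. It is only a sketch under stated
assumptions: the release fence stands in for smp_wmb() (it is stronger
than a pure store barrier), the acquire fence stands in for the barrier on
the lockless reader side, and struct demo_sb, demo_publish() and
demo_read() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_BORN 0x1u

/* Hypothetical object that is set up first and published via a flag. */
struct demo_sb {
    int payload;        /* stands in for fields initialised before publication */
    atomic_uint flags;
};

/* Writer: finish the setup, then make the flag visible. */
static void demo_publish(struct demo_sb *sb, int payload)
{
    sb->payload = payload;
    /* Order the setup stores before the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_fetch_or_explicit(&sb->flags, DEMO_BORN, memory_order_relaxed);
}

/* Lockless reader: only trust the payload once the flag is observed. */
static bool demo_read(struct demo_sb *sb, int *out)
{
    unsigned int flags = atomic_load_explicit(&sb->flags, memory_order_relaxed);

    if (!(flags & DEMO_BORN))
        return false;
    /* Pairs with the release fence in demo_publish(). */
    atomic_thread_fence(memory_order_acquire);
    *out = sb->payload;
    return true;
}

static void demo_publish_example(void)
{
    struct demo_sb sb = { .payload = 0 };
    int value = -1;

    atomic_init(&sb.flags, 0);
    assert(!demo_read(&sb, &value));  /* not published yet */
    demo_publish(&sb, 42);
    assert(demo_read(&sb, &value) && value == 42);
}
```

If the flag store were allowed to become visible before the setup stores,
a reader on another CPU could see the flag and still read uninitialised
state, which is the hazard the barrier in this patch guards against.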
Fixes: bf109c64040f ("fuse: implement crossmounts")
Cc: mreitz(a)redhat.com
Cc: stable(a)vger.kernel.org # v5.10+
Signed-off-by: Greg Kurz <groug(a)kaod.org>
---
fs/fuse/dir.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 3fd1b71e546b..3fa8604c21d5 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -352,6 +352,17 @@ static struct vfsmount *fuse_dentry_automount(struct path *path)
sb->s_flags |= SB_ACTIVE;
fsc->root = dget(sb->s_root);
+
+ /*
+ * FIXME: setting SB_BORN requires a write barrier for
+ * super_cache_count(). We should actually come
+ * up with a proper ->get_tree() implementation
+ * for submounts and call vfs_get_tree() to take
+ * care of the write barrier.
+ */
+ smp_wmb();
+ sb->s_flags |= SB_BORN;
+
/* We are done configuring the superblock, so unlock it */
up_write(&sb->s_umount);
--
2.31.1