The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
192ad3c27a48 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, the argument passed to lockdep_hardirqs_{on,off}() is fixed
to CALLER_ADDR0.
The function trace_hardirqs_on_caller() is intended to use caller_addr
to represent the address that the caller wants to have traced.
For example, the lockdep log on riscv always shows the last
{enabled,disabled} location as __trace_hardirqs_{on,off} (when invoked
through them):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After using trace_hardirqs_{on,off}_caller, we get more useful information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
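To see why the old code always reported __trace_hardirqs_{on,off} as the
last enabled/disabled site: CALLER_ADDR0 expands to
__builtin_return_address(0) of the *current* function, so evaluating it
inside trace_hardirqs_on_caller() yields an address inside the wrapper,
not the site the wrapper was called from. A minimal, hypothetical
user-space sketch of the before/after behaviour (stand-in helpers, not
the kernel sources):

#include <stdio.h>

#define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))

static unsigned long last_enabled_at;	/* stand-in for lockdep's record */

static void lockdep_hardirqs_on(unsigned long ip)
{
	last_enabled_at = ip;		/* "hardirqs last enabled at (ip)" */
}

__attribute__((noinline))
static void trace_hardirqs_on_caller(unsigned long caller_addr)
{
	/*
	 * Before the fix this was lockdep_hardirqs_on(CALLER_ADDR0); here
	 * CALLER_ADDR0 is the return address of *this* function, i.e. an
	 * address inside __trace_hardirqs_on, so every report pointed at
	 * the wrapper.
	 */
	lockdep_hardirqs_on(caller_addr);	/* after the fix: real call site */
}

__attribute__((noinline))
static void __trace_hardirqs_on(void)	/* wrapper as used from entry code */
{
	trace_hardirqs_on_caller(CALLER_ADDR0);	/* address inside the real caller */
}

int main(void)
{
	__trace_hardirqs_on();
	printf("hardirqs last enabled at %#lx\n", last_enabled_at);
	return 0;
}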
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
192ad3c27a48 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, the argument passed to lockdep_hardirqs_{on,off}() is fixed
to CALLER_ADDR0.
The function trace_hardirqs_on_caller() is intended to use caller_addr
to represent the address that the caller wants to have traced.
For example, the lockdep log on riscv always shows the last
{enabled,disabled} location as __trace_hardirqs_{on,off} (when invoked
through them):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After using trace_hardirqs_{on,off}_caller, we get more useful information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
192ad3c27a48 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, the argument passed to lockdep_hardirqs_{on,off}() is fixed
to CALLER_ADDR0.
The function trace_hardirqs_on_caller() is intended to use caller_addr
to represent the address that the caller wants to have traced.
For example, the lockdep log on riscv always shows the last
{enabled,disabled} location as __trace_hardirqs_{on,off} (when invoked
through them):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After using trace_hardirqs_{on,off}_caller, we get more useful information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
54c3931957f6 ("tracing: hold caller_addr to hardirq_{enable,disable}_ip")
8b023accc8df ("lockdep: Fix -Wunused-parameter for _THIS_IP_")
ef9989afda73 ("kvm: add guest_state_{enter,exit}_irqoff()")
ed922739c919 ("KVM: Use interval tree to do fast hva lookup in memslots")
26b8345abc75 ("KVM: Resolve memslot ID via a hash table instead of via a static array")
1e8617d37fc3 ("KVM: Move WARN on invalid memslot index to update_memslots()")
4e4d30cb9b87 ("KVM: Resync only arch fields when slots_arch_lock gets reacquired")
c5b077549136 ("KVM: Convert the kvm->vcpus array to a xarray")
27592ae8dbe4 ("KVM: Move wiping of the kvm->vcpus array to common code")
bda44d844758 ("KVM: Ensure local memslot copies operate on up-to-date arch-specific data")
99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 54c3931957f6a6194d5972eccc36d052964b2abe Mon Sep 17 00:00:00 2001
From: Yipeng Zou <zouyipeng(a)huawei.com>
Date: Thu, 1 Sep 2022 18:45:14 +0800
Subject: [PATCH] tracing: hold caller_addr to hardirq_{enable,disable}_ip
Currently, the argument passed to lockdep_hardirqs_{on,off}() is fixed
to CALLER_ADDR0.
The function trace_hardirqs_on_caller() is intended to use caller_addr
to represent the address that the caller wants to have traced.
For example, the lockdep log on riscv always shows the last
{enabled,disabled} location as __trace_hardirqs_{on,off} (when invoked
through them):
[ 57.853175] hardirqs last enabled at (2519): __trace_hardirqs_on+0xc/0x14
[ 57.853848] hardirqs last disabled at (2520): __trace_hardirqs_off+0xc/0x14
After using trace_hardirqs_{on,off}_caller, we get more useful information:
[ 53.781428] hardirqs last enabled at (2595): restore_all+0xe/0x66
[ 53.782185] hardirqs last disabled at (2596): ret_from_exception+0xa/0x10
Link: https://lkml.kernel.org/r/20220901104515.135162-2-zouyipeng@huawei.com
Cc: stable(a)vger.kernel.org
Fixes: c3bc8fd637a96 ("tracing: Centralize preemptirq tracepoints and unify their usage")
Signed-off-by: Yipeng Zou <zouyipeng(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index 95b58bd757ce..1e130da1b742 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -95,14 +95,14 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
}
lockdep_hardirqs_on_prepare();
- lockdep_hardirqs_on(CALLER_ADDR0);
+ lockdep_hardirqs_on(caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
NOKPROBE_SYMBOL(trace_hardirqs_on_caller);
__visible void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- lockdep_hardirqs_off(CALLER_ADDR0);
+ lockdep_hardirqs_off(caller_addr);
if (!this_cpu_read(tracing_irq_cpu)) {
this_cpu_write(tracing_irq_cpu, 1);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
47311db8e8f3 ("tracefs: Only clobber mode/uid/gid on remount if asked")
851e99ebeec3 ("tracefs: Set the group ownership in apply_options() not parse_options()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47311db8e8f33011d90dee76b39c8886120cdda4 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Fri, 26 Aug 2022 17:44:17 -0700
Subject: [PATCH] tracefs: Only clobber mode/uid/gid on remount if asked
Users may have explicitly configured their tracefs permissions; we
shouldn't overwrite those just because a second mount appeared.
Only clobber if the options were provided at mount time.
Note: the previous behavior was especially surprising in the presence of
automounted /sys/kernel/debug/tracing/.
Existing behavior:
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwx------
## Unexpected: the automount changed mode for other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwx------
New behavior (after this change):
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwxr-xr-x
## Expected: the automount does not change other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
Link: https://lkml.kernel.org/r/20220826174353.2.Iab6e5ea57963d6deca5311b27fb7226…
Cc: stable(a)vger.kernel.org
Fixes: 4282d60689d4f ("tracefs: Add new tracefs file system")
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 81d26abf486f..da85b3979195 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -141,6 +141,8 @@ struct tracefs_mount_opts {
kuid_t uid;
kgid_t gid;
umode_t mode;
+ /* Opt_* bitfield. */
+ unsigned int opts;
};
enum {
@@ -241,6 +243,7 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
kgid_t gid;
char *p;
+ opts->opts = 0;
opts->mode = TRACEFS_DEFAULT_MODE;
while ((p = strsep(&data, ",")) != NULL) {
@@ -275,24 +278,36 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
* but traditionally tracefs has ignored all mount options
*/
}
+
+ opts->opts |= BIT(token);
}
return 0;
}
-static int tracefs_apply_options(struct super_block *sb)
+static int tracefs_apply_options(struct super_block *sb, bool remount)
{
struct tracefs_fs_info *fsi = sb->s_fs_info;
struct inode *inode = d_inode(sb->s_root);
struct tracefs_mount_opts *opts = &fsi->mount_opts;
- inode->i_mode &= ~S_IALLUGO;
- inode->i_mode |= opts->mode;
+ /*
+ * On remount, only reset mode/uid/gid if they were provided as mount
+ * options.
+ */
+
+ if (!remount || opts->opts & BIT(Opt_mode)) {
+ inode->i_mode &= ~S_IALLUGO;
+ inode->i_mode |= opts->mode;
+ }
- inode->i_uid = opts->uid;
+ if (!remount || opts->opts & BIT(Opt_uid))
+ inode->i_uid = opts->uid;
- /* Set all the group ids to the mount option */
- set_gid(sb->s_root, opts->gid);
+ if (!remount || opts->opts & BIT(Opt_gid)) {
+ /* Set all the group ids to the mount option */
+ set_gid(sb->s_root, opts->gid);
+ }
return 0;
}
@@ -307,7 +322,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
if (err)
goto fail;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, true);
fail:
return err;
@@ -359,7 +374,7 @@ static int trace_fill_super(struct super_block *sb, void *data, int silent)
sb->s_op = &tracefs_super_operations;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, false);
return 0;
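The core of the change is to remember which of Opt_uid/Opt_gid/Opt_mode
were actually present in the mount data and, on remount, leave the other
attributes untouched. A self-contained sketch of that pattern
(hypothetical helpers, not the tracefs code, assuming only "mode=" is
parsed):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum { Opt_uid, Opt_gid, Opt_mode };
#define BIT(n) (1U << (n))

struct mount_opts {
	unsigned int opts;	/* Opt_* bitfield: which options were given */
	unsigned int mode;
};

struct superblock {
	unsigned int mode;	/* stand-in for the root inode's i_mode */
};

static void parse_options(const char *data, struct mount_opts *o)
{
	o->opts = 0;
	o->mode = 0700;				/* default mode */
	if (data && strstr(data, "mode=0755")) {
		o->mode = 0755;
		o->opts |= BIT(Opt_mode);	/* remember it was requested */
	}
}

static void apply_options(struct superblock *sb, struct mount_opts *o, bool remount)
{
	/* On remount, only reset the mode if it was provided as an option. */
	if (!remount || (o->opts & BIT(Opt_mode)))
		sb->mode = o->mode;
}

int main(void)
{
	struct superblock sb = { .mode = 0755 };	/* admin already chmod'ed it */
	struct mount_opts o;

	parse_options("", &o);				/* remount with no options */
	apply_options(&sb, &o, true);
	printf("mode after remount: %o\n", sb.mode);	/* stays 0755 */
	return 0;
}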
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
47311db8e8f3 ("tracefs: Only clobber mode/uid/gid on remount if asked")
851e99ebeec3 ("tracefs: Set the group ownership in apply_options() not parse_options()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47311db8e8f33011d90dee76b39c8886120cdda4 Mon Sep 17 00:00:00 2001
From: Brian Norris <briannorris(a)chromium.org>
Date: Fri, 26 Aug 2022 17:44:17 -0700
Subject: [PATCH] tracefs: Only clobber mode/uid/gid on remount if asked
Users may have explicitly configured their tracefs permissions; we
shouldn't overwrite those just because a second mount appeared.
Only clobber if the options were provided at mount time.
Note: the previous behavior was especially surprising in the presence of
automounted /sys/kernel/debug/tracing/.
Existing behavior:
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwx------
## Unexpected: the automount changed mode for other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwx------
New behavior (after this change):
## Pre-existing status: tracefs is 0755.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
## (Re)trigger the automount.
# umount /sys/kernel/debug/tracing
# stat -c '%A' /sys/kernel/debug/tracing/.
drwxr-xr-x
## Expected: the automount does not change other mount instances.
# stat -c '%A' /sys/kernel/tracing/
drwxr-xr-x
Link: https://lkml.kernel.org/r/20220826174353.2.Iab6e5ea57963d6deca5311b27fb7226…
Cc: stable(a)vger.kernel.org
Fixes: 4282d60689d4f ("tracefs: Add new tracefs file system")
Signed-off-by: Brian Norris <briannorris(a)chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 81d26abf486f..da85b3979195 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -141,6 +141,8 @@ struct tracefs_mount_opts {
kuid_t uid;
kgid_t gid;
umode_t mode;
+ /* Opt_* bitfield. */
+ unsigned int opts;
};
enum {
@@ -241,6 +243,7 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
kgid_t gid;
char *p;
+ opts->opts = 0;
opts->mode = TRACEFS_DEFAULT_MODE;
while ((p = strsep(&data, ",")) != NULL) {
@@ -275,24 +278,36 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
* but traditionally tracefs has ignored all mount options
*/
}
+
+ opts->opts |= BIT(token);
}
return 0;
}
-static int tracefs_apply_options(struct super_block *sb)
+static int tracefs_apply_options(struct super_block *sb, bool remount)
{
struct tracefs_fs_info *fsi = sb->s_fs_info;
struct inode *inode = d_inode(sb->s_root);
struct tracefs_mount_opts *opts = &fsi->mount_opts;
- inode->i_mode &= ~S_IALLUGO;
- inode->i_mode |= opts->mode;
+ /*
+ * On remount, only reset mode/uid/gid if they were provided as mount
+ * options.
+ */
+
+ if (!remount || opts->opts & BIT(Opt_mode)) {
+ inode->i_mode &= ~S_IALLUGO;
+ inode->i_mode |= opts->mode;
+ }
- inode->i_uid = opts->uid;
+ if (!remount || opts->opts & BIT(Opt_uid))
+ inode->i_uid = opts->uid;
- /* Set all the group ids to the mount option */
- set_gid(sb->s_root, opts->gid);
+ if (!remount || opts->opts & BIT(Opt_gid)) {
+ /* Set all the group ids to the mount option */
+ set_gid(sb->s_root, opts->gid);
+ }
return 0;
}
@@ -307,7 +322,7 @@ static int tracefs_remount(struct super_block *sb, int *flags, char *data)
if (err)
goto fail;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, true);
fail:
return err;
@@ -359,7 +374,7 @@ static int trace_fill_super(struct super_block *sb, void *data, int silent)
sb->s_op = &tracefs_super_operations;
- tracefs_apply_options(sb);
+ tracefs_apply_options(sb, false);
return 0;
This patch series backports a few VM preemption_status, steal_time and
PV TLB flushing fixes to the 5.10 stable kernel.
Most of the changes backport cleanly, except that I had to work around a
few because of missing support/APIs in the 5.10 kernel. I have captured
those in the changelog as well as in the individual patches.
Changelog
- Use mark_page_dirty_in_slot api without kvm argument (KVM: x86: Fix
recording of guest steal time / preempted status)
- Avoid checking for xen_msr and SEV-ES conditions (KVM: x86:
do not set st->preempted when going back to user space)
- Use VCPU_STAT macro to expose preemption_reported and
preemption_other fields (KVM: x86: do not report a vCPU as preempted
outside instruction boundaries)
David Woodhouse (2):
KVM: x86: Fix recording of guest steal time / preempted status
KVM: Fix steal time asm constraints
Lai Jiangshan (1):
KVM: x86: Ensure PV TLB flush tracepoint reflects KVM behavior
Paolo Bonzini (5):
KVM: x86: do not set st->preempted when going back to user space
KVM: x86: do not report a vCPU as preempted outside instruction
boundaries
KVM: x86: revalidate steal time cache if MSR value changes
KVM: x86: do not report preemption if the steal time cache is stale
KVM: x86: move guest_pv_has out of user_access section
Sean Christopherson (1):
KVM: x86: Remove obsolete disabling of page faults in
kvm_arch_vcpu_put()
arch/x86/include/asm/kvm_host.h | 5 +-
arch/x86/kvm/svm/svm.c | 2 +
arch/x86/kvm/vmx/vmx.c | 1 +
arch/x86/kvm/x86.c | 164 ++++++++++++++++++++++----------
4 files changed, 122 insertions(+), 50 deletions(-)
--
2.37.1
Unsanitized pages trigger WARN_ON() unconditionally, which can panic the
whole computer if /proc/sys/kernel/panic_on_warn is set.
In sgx_init(), if misc_register() fails, or misc_register() succeeds but
neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
stopped prematurely. This may leave unsanitized pages, which will result
in a false warning.
Refine __sgx_sanitize_pages() to return:
1. Zero when the sanitization process is complete or ksgxd has been
requested to stop.
2. The number of unsanitized pages otherwise.
Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org…
Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
Cc: stable(a)vger.kernel.org # v5.13+
Reported-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
v9:
- Remove left_dirty declaration from ksgxd().
- Update commit message.
v8:
- Discard changes that are not relevant for the stable fix. This
does absolutely minimum to address the bug:
https://lore.kernel.org/linux-sgx/a5fa56bdc57d6472a306bd8d795afc674b724538.…
v7:
- Rewrote commit message.
- Do not return -ECANCELED on premature stop. Instead use zero for both
premature stop and complete sanitization.
v6:
- Address Reinette's feedback:
https://lore.kernel.org/linux-sgx/Yw6%2FiTzSdSw%2FY%2FVO@kernel.org/
v5:
- Add the klog dump and sysctl option to the commit message.
v4:
- Explain expectations for dirty_page_list in the function header, instead
of an inline comment.
- Improve commit message to explain the conditions better.
- Return the number of pages left dirty to ksgxd() and print warning after
the 2nd call, if there are any.
v3:
- Remove WARN_ON().
- Tuned comments and the commit message a bit.
v2:
- Replaced WARN_ON() with optional pr_info() inside
__sgx_sanitize_pages().
- Rewrote the commit message.
- Added the fixes tag.
---
arch/x86/kernel/cpu/sgx/main.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 515e2a5f25bb..0aad028f04d4 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -49,9 +49,13 @@ static LIST_HEAD(sgx_dirty_page_list);
* Reset post-kexec EPC pages to the uninitialized state. The pages are removed
* from the input list, and made available for the page allocator. SECS pages
* prepending their children in the input list are left intact.
+ *
+ * Return 0 when sanitization was successful or kthread was stopped, and the
+ * number of unsanitized pages otherwise.
*/
-static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
+static unsigned long __sgx_sanitize_pages(struct list_head *dirty_page_list)
{
+ unsigned long left_dirty = 0;
struct sgx_epc_page *page;
LIST_HEAD(dirty);
int ret;
@@ -59,7 +63,7 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
/* dirty_page_list is thread-local, no need for a lock: */
while (!list_empty(dirty_page_list)) {
if (kthread_should_stop())
- return;
+ return 0;
page = list_first_entry(dirty_page_list, struct sgx_epc_page, list);
@@ -92,12 +96,14 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list)
} else {
/* The page is not yet clean - move to the dirty list. */
list_move_tail(&page->list, &dirty);
+ left_dirty++;
}
cond_resched();
}
list_splice(&dirty, dirty_page_list);
+ return left_dirty;
}
static bool sgx_reclaimer_age(struct sgx_epc_page *epc_page)
@@ -395,10 +401,7 @@ static int ksgxd(void *p)
* required for SECS pages, whose child pages blocked EREMOVE.
*/
__sgx_sanitize_pages(&sgx_dirty_page_list);
- __sgx_sanitize_pages(&sgx_dirty_page_list);
-
- /* sanity check: */
- WARN_ON(!list_empty(&sgx_dirty_page_list));
+ WARN_ON(__sgx_sanitize_pages(&sgx_dirty_page_list));
while (!kthread_should_stop()) {
if (try_to_freeze())
--
2.37.2
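The resulting contract — a pass returns 0 when it completed or was asked
to stop, otherwise the number of pages still dirty, and only the second
pass is expected to come back clean — can be sketched in isolation
(hypothetical toy model, not the SGX code):

#include <stdio.h>

#define NR_PAGES 8

/* 0 = clean, 1 = cleans on first pass, 2 = needs a second pass (like a
 * SECS page whose children blocked EREMOVE). */
static int page_state[NR_PAGES] = { 1, 2, 1, 1, 2, 1, 2, 1 };
static int stop_requested;

/* Returns 0 when sanitization finished or a stop was requested,
 * otherwise the number of pages still dirty after this pass. */
static unsigned long sanitize_pages(void)
{
	unsigned long left_dirty = 0;
	int i;

	for (i = 0; i < NR_PAGES; i++) {
		if (stop_requested)
			return 0;	/* stopping early is not an error */
		if (page_state[i])
			page_state[i]--;	/* one removal attempt */
		if (page_state[i])
			left_dirty++;		/* still dirty, retry next pass */
	}
	return left_dirty;
}

int main(void)
{
	sanitize_pages();		/* first pass: some pages still dirty */
	if (sanitize_pages())		/* second pass: warn only if any remain */
		fprintf(stderr, "WARN: unsanitized pages remain\n");
	return 0;
}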
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 81fa6fd13b5c43601fba8486f2385dbd7c1935e2
Gitweb: https://git.kernel.org/tip/81fa6fd13b5c43601fba8486f2385dbd7c1935e2
Author: Haitao Huang <haitao.huang(a)linux.intel.com>
AuthorDate: Tue, 06 Sep 2022 03:02:21 +03:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Thu, 08 Sep 2022 13:28:31 -07:00
x86/sgx: Handle VA page allocation failure for EAUG on PF.
VM_FAULT_NOPAGE is the expected behaviour for the -EBUSY failure path
when augmenting a page, as this means that the reclaimer thread has been
triggered; the intention is just to round-trip to ring-3 and retry with
a new page fault.
Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
Signed-off-by: Haitao Huang <haitao.huang(a)linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre(a)intel.com>
Tested-by: Vijay Dhanraj <vijay.dhanraj(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20220906000221.34286-3-jarkko@kernel.org
---
arch/x86/kernel/cpu/sgx/encl.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 24c1bb8..8bdeae2 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -344,8 +344,11 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
}
va_page = sgx_encl_grow(encl, false);
- if (IS_ERR(va_page))
+ if (IS_ERR(va_page)) {
+ if (PTR_ERR(va_page) == -EBUSY)
+ vmret = VM_FAULT_NOPAGE;
goto err_out_epc;
+ }
if (va_page)
list_add(&va_page->list, &encl->va_pages);
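The fix boils down to mapping one specific allocation failure onto the
fault-handler return code that means "retry". A hypothetical,
self-contained sketch of that mapping (illustrative names only, not the
kernel's vm_fault_t machinery):

#include <errno.h>
#include <stdio.h>

enum vm_fault { VM_FAULT_NOPAGE, VM_FAULT_SIGBUS };

static int alloc_backing_page(int simulate_busy)
{
	/* -EBUSY stands for "reclaim has been kicked, try again later". */
	return simulate_busy ? -EBUSY : 0;
}

static enum vm_fault fault_handler(int simulate_busy)
{
	enum vm_fault ret = VM_FAULT_SIGBUS;	/* default: hard failure */
	int err = alloc_backing_page(simulate_busy);

	if (err) {
		if (err == -EBUSY)
			ret = VM_FAULT_NOPAGE;	/* let user space retry the access */
		return ret;
	}
	return VM_FAULT_NOPAGE;			/* mapping set up; resume */
}

int main(void)
{
	printf("busy -> %s\n", fault_handler(1) == VM_FAULT_NOPAGE ? "retry" : "SIGBUS");
	printf("ok   -> %s\n", fault_handler(0) == VM_FAULT_NOPAGE ? "retry" : "SIGBUS");
	return 0;
}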