[
This is the non-RFC version.
It went through and passed all my tests. If there are no objections,
I'm going to include this in my pull request. I still have patches
in my INBOX that may also be included, and I need to run those through
my tests as well, so the pull request won't be immediate.
]
Nicolai Stange discovered that Live Kernel Patching can have unforeseen
consequences if tracing is enabled while functions are patched. The
reason is that Live Kernel Patching is built on top of ftrace, which
has the patched functions call the live kernel patching trampoline
directly, and that trampoline modifies the regs->ip address to return
to the patched function.

But during the transition that changes the call away from the customized
trampoline, the tracing code needs to have its handler called as well,
so the function's fentry location must be changed from calling the live
kernel patching trampoline to calling the ftrace_regs_caller trampoline,
which iterates through all the registered ftrace handlers for that
function.

During this transition, a breakpoint is added to do the live code
modifications. But if that breakpoint is hit, it just skips calling
any handler and makes the call site act as a nop. For tracing, the
worst that can happen is that you miss a function being traced, but
for live kernel patching the effects are more severe, as the old buggy
function is now called.

To solve this, int3_emulate_call() is created for x86_64 to allow
ftrace on x86_64 to emulate the call to ftrace_regs_caller(), which
makes sure all the handlers registered for that function are still
called. And this keeps live kernel patching happy!
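
The call emulation described above can be sketched in userspace. This is only a simplified model, not the kernel code: struct model_regs, model_emulate_push() and model_emulate_call() are hypothetical names, and the stack is an ordinary array. It assumes the trap leaves the saved ip just past the one-byte int3 that replaced the first byte of a five-byte call instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes of the x86 instructions involved: int3 is 1 byte, call rel32 is 5. */
#define INT3_INSN_SIZE 1
#define CALL_INSN_SIZE 5

/* Hypothetical stand-in for the two saved registers that matter here. */
struct model_regs {
	uintptr_t ip;	/* address execution resumes at after the trap */
	uintptr_t *sp;	/* top of a downward-growing stack */
};

/* Push one word, as a call instruction does with its return address. */
static void model_emulate_push(struct model_regs *regs, uintptr_t val)
{
	assert(regs != NULL && regs->sp != NULL);
	regs->sp -= 1;
	*regs->sp = val;
}

/*
 * Emulate "call func" for a call site whose first byte was replaced by int3.
 * The call site starts at ip - INT3_INSN_SIZE, so the return address is
 * CALL_INSN_SIZE past that. Resuming at func then behaves like the call.
 */
static void model_emulate_call(struct model_regs *regs, uintptr_t func)
{
	model_emulate_push(regs, regs->ip - INT3_INSN_SIZE + CALL_INSN_SIZE);
	regs->ip = func;
}
```

With a call site at 0x1000, the trap reports ip = 0x1001; after model_emulate_call() the pushed return address is 0x1005 and ip is the target. Pushing requires free space below the saved stack pointer in the trap frame, which is what the "Add gap to int3" patch in this series is for.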
To minimize the changes, and to avoid controversial patches, this
only changes x86_64. Due to the way x86_32 implements regs->sp,
the complexity of emulating calls on that platform is too much for
stable patches, and live kernel patching does not support x86_32 anyway.
Josh Poimboeuf (1):
      x86_64: Add gap to int3 to allow for call emulation

Peter Zijlstra (2):
      x86_64: Allow breakpoints to emulate call instructions
      ftrace/x86_64: Emulate call function while updating in breakpoint handler
----
arch/x86/entry/entry_64.S | 18 ++++++++++++++++--
arch/x86/include/asm/text-patching.h | 28 ++++++++++++++++++++++++++++
arch/x86/kernel/ftrace.c | 32 +++++++++++++++++++++++++++-----
3 files changed, 71 insertions(+), 7 deletions(-)
Nicolai Stange discovered that Live Kernel Patching can have unforeseen
consequences if tracing is enabled while functions are patched. The
reason is that Live Kernel Patching is built on top of ftrace, which
has the patched functions call the live kernel patching trampoline
directly, and that trampoline modifies the regs->ip address to return
to the patched function.

But during the transition that changes the call away from the customized
trampoline, the tracing code needs to have its handler called as well,
so the function's fentry location must be changed from calling the live
kernel patching trampoline to calling the ftrace_regs_caller trampoline,
which iterates through all the registered ftrace handlers for that
function.

During this transition, a breakpoint is added to do the live code
modifications. But if that breakpoint is hit, it just skips calling
any handler and makes the call site act as a nop. For tracing, the
worst that can happen is that you miss a function being traced, but
for live kernel patching the effects are more severe, as the old buggy
function is now called.

To solve this, int3_emulate_call() is created for x86_64 to allow
ftrace on x86_64 to emulate the call to ftrace_regs_caller(), which
makes sure all the handlers registered for that function are still
called. And this keeps live kernel patching happy!
To minimize the changes, and to avoid controversial patches, this
only changes x86_64. Due to the way x86_32 implements regs->sp,
the complexity of emulating calls on that platform is too much for
stable patches, and live kernel patching does not support x86_32 anyway.
Josh Poimboeuf (1):
      x86_64: Add gap to int3 to allow for call emulation

Peter Zijlstra (2):
      x86_64: Allow breakpoints to emulate call functions
      ftrace/x86_64: Emulate call function while updating in breakpoint handler
----
arch/x86/entry/entry_64.S | 18 ++++++++++++++++--
arch/x86/include/asm/text-patching.h | 22 ++++++++++++++++++++++
arch/x86/kernel/ftrace.c | 32 +++++++++++++++++++++++++++-----
3 files changed, 65 insertions(+), 7 deletions(-)
The kheaders archive, consisting of the kernel headers used for compiling
bpf programs, is in /proc. However, there is concern that placing it there
will make it permanent. Let us move it to /sys/kernel as discussed [1].
[1] https://lore.kernel.org/patchwork/patch/1067310/#1265969
Suggested-by: Steven Rostedt <rostedt(a)goodmis.org>
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
---
This patch applies on top of the previous patch that was applied to the
driver tree:
https://lore.kernel.org/patchwork/patch/1067310/
v1->v2: Fixed some kconfig nits (Masami).
init/Kconfig | 16 ++++-----
kernel/Makefile | 4 +--
kernel/{gen_ikh_data.sh => gen_kheaders.sh} | 2 +-
kernel/kheaders.c | 40 +++++++++------------
4 files changed, 26 insertions(+), 36 deletions(-)
rename kernel/{gen_ikh_data.sh => gen_kheaders.sh} (98%)
diff --git a/init/Kconfig b/init/Kconfig
index 26a364a95b57..c3661991b089 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -579,15 +579,13 @@ config IKCONFIG_PROC
This option enables access to the kernel configuration file
through /proc/config.gz.
-config IKHEADERS_PROC
- tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
- depends on PROC_FS
- help
- This option enables access to the kernel header and other artifacts that
- are generated during the build process. These can be used to build eBPF
- tracing programs, or similar programs. If you build the headers as a
- module, a module called kheaders.ko is built which can be loaded on-demand
- to get access to the headers.
+config IKHEADERS
+ tristate "Enable kernel headers through /sys/kernel/kheaders.tar.xz"
+ help
+ This option enables access to the in-kernel headers that are generated during
+ the build process. These can be used to build eBPF tracing programs,
+ or similar programs. If you build the headers as a module, a module called
+ kheaders.ko is built which can be loaded on-demand to get access to headers.
config LOG_BUF_SHIFT
int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
diff --git a/kernel/Makefile b/kernel/Makefile
index 12399614c350..b32a558fae2f 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,7 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
obj-$(CONFIG_IKCONFIG) += configs.o
-obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
+obj-$(CONFIG_IKHEADERS) += kheaders.o
obj-$(CONFIG_SMP) += stop_machine.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -126,7 +126,7 @@ $(obj)/config_data.gz: $(KCONFIG_CONFIG) FORCE
$(obj)/kheaders.o: $(obj)/kheaders_data.tar.xz
quiet_cmd_genikh = CHK $(obj)/kheaders_data.tar.xz
-cmd_genikh = $(srctree)/kernel/gen_ikh_data.sh $@
+cmd_genikh = $(srctree)/kernel/gen_kheaders.sh $@
$(obj)/kheaders_data.tar.xz: FORCE
$(call cmd,genikh)
diff --git a/kernel/gen_ikh_data.sh b/kernel/gen_kheaders.sh
similarity index 98%
rename from kernel/gen_ikh_data.sh
rename to kernel/gen_kheaders.sh
index 591a94f7b387..581b83534587 100755
--- a/kernel/gen_ikh_data.sh
+++ b/kernel/gen_kheaders.sh
@@ -2,7 +2,7 @@
# SPDX-License-Identifier: GPL-2.0
# This script generates an archive consisting of kernel headers
-# for CONFIG_IKHEADERS_PROC.
+# for CONFIG_IKHEADERS.
set -e
spath="$(dirname "$(readlink -f "$0")")"
kroot="$spath/.."
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
index 70ae6052920d..6a16f8f6898d 100644
--- a/kernel/kheaders.c
+++ b/kernel/kheaders.c
@@ -8,9 +8,8 @@
#include <linux/kernel.h>
#include <linux/module.h>
-#include <linux/proc_fs.h>
+#include <linux/kobject.h>
#include <linux/init.h>
-#include <linux/uaccess.h>
/*
* Define kernel_headers_data and kernel_headers_data_end, within which the
@@ -31,39 +30,32 @@ extern char kernel_headers_data;
extern char kernel_headers_data_end;
static ssize_t
-ikheaders_read_current(struct file *file, char __user *buf,
- size_t len, loff_t *offset)
+ikheaders_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *bin_attr,
+ char *buf, loff_t off, size_t len)
{
- return simple_read_from_buffer(buf, len, offset,
- &kernel_headers_data,
- &kernel_headers_data_end -
- &kernel_headers_data);
+ memcpy(buf, &kernel_headers_data + off, len);
+ return len;
}
-static const struct file_operations ikheaders_file_ops = {
- .read = ikheaders_read_current,
- .llseek = default_llseek,
+static struct bin_attribute kheaders_attr __ro_after_init = {
+ .attr = {
+ .name = "kheaders.tar.xz",
+ .mode = S_IRUGO,
+ },
+ .read = &ikheaders_read,
};
static int __init ikheaders_init(void)
{
- struct proc_dir_entry *entry;
-
- /* create the current headers file */
- entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
- &ikheaders_file_ops);
- if (!entry)
- return -ENOMEM;
-
- proc_set_size(entry,
- &kernel_headers_data_end -
- &kernel_headers_data);
- return 0;
+ kheaders_attr.size = (&kernel_headers_data_end -
+ &kernel_headers_data);
+ return sysfs_create_bin_file(kernel_kobj, &kheaders_attr);
}
static void __exit ikheaders_cleanup(void)
{
- remove_proc_entry("kheaders.tar.xz", NULL);
+ sysfs_remove_bin_file(kernel_kobj, &kheaders_attr);
}
module_init(ikheaders_init);
--
2.21.0.1020.gf2820cf01a-goog
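
A note on the new ikheaders_read() in the patch above: it copies with a bare memcpy() and does no range checking of its own. As far as I understand, that is safe because the sysfs core clamps the offset and length to the attribute's size (set in ikheaders_init()) before calling the handler. The userspace sketch below models that split of responsibilities. All names in it are hypothetical, and the clamping function is an assumption about what the caller does, not a copy of kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the embedded header archive. */
static const char model_data[] = "kheaders";
#define MODEL_SIZE (sizeof(model_data) - 1)

/* Mirrors the shape of the patch's handler: copy and trust the caller. */
static long model_read(char *buf, size_t off, size_t len)
{
	assert(buf != NULL);
	memcpy(buf, model_data + off, len);
	return (long)len;
}

/* Assumed caller behaviour: clamp the request to the advertised size. */
static long model_clamped_read(char *buf, size_t off, size_t len)
{
	if (off >= MODEL_SIZE)
		return 0;		/* reading at or past the end: EOF */
	if (len > MODEL_SIZE - off)
		len = MODEL_SIZE - off;	/* trim reads that overrun the end */
	return model_read(buf, off, len);
}
```

If the size were left at zero, a caller that clamps this way would have nothing to clamp against, which is presumably why the init function fills in kheaders_attr.size before registering the file.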
From: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
[ Upstream commit 30c04d796b693e22405c38e9b78e9a364e4c77e6 ]
run_netsocktests will be marked as passed regardless of the actual test
result from ./socket:
selftests: net: run_netsocktests
========================================
--------------------
running socket test
--------------------
[FAIL]
ok 1..6 selftests: net: run_netsocktests [PASS]
This is because the test script itself has been executed successfully.
Fix this by calling exit 1 when the test fails.
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/net/run_netsocktests | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/run_netsocktests b/tools/testing/selftests/net/run_netsocktests
index c09a682df56a..19486dab2379 100644
--- a/tools/testing/selftests/net/run_netsocktests
+++ b/tools/testing/selftests/net/run_netsocktests
@@ -6,7 +6,7 @@ echo "--------------------"
./socket
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ exit 1
else
echo "[PASS]"
fi
-
--
2.20.1
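
The behaviour this patch fixes can be reproduced outside the selftest harness. The sketch below is only an illustration: run_status(), before_fix and after_fix are hypothetical names, and the shell builtin `false` stands in for a failing ./socket. It shows that a script's exit status is that of the last command it ran unless it calls exit explicitly.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Run a shell snippet; return its exit status, or -1 on abnormal exit. */
static int run_status(const char *script)
{
	int rc;

	assert(script != NULL);
	rc = system(script);
	if (rc == -1 || !WIFEXITED(rc))
		return -1;
	return WEXITSTATUS(rc);
}

/* Shape of the script before the fix: prints [FAIL], then ends normally. */
static const char before_fix[] =
	"false; if [ $? -ne 0 ]; then echo '[FAIL]'; "
	"else echo '[PASS]'; fi";

/* Shape after the fix: the failure branch exits with status 1. */
static const char after_fix[] =
	"false; if [ $? -ne 0 ]; then echo '[FAIL]'; exit 1; "
	"else echo '[PASS]'; fi";
```

Here run_status(before_fix) yields 0 because the last command executed is a successful echo, while run_status(after_fix) yields 1, which is what a test runner needs in order to report the failure.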
From: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
[ Upstream commit 30c04d796b693e22405c38e9b78e9a364e4c77e6 ]
run_netsocktests will be marked as passed regardless of the actual test
result from ./socket:
selftests: net: run_netsocktests
========================================
--------------------
running socket test
--------------------
[FAIL]
ok 1..6 selftests: net: run_netsocktests [PASS]
This is because the test script itself has been executed successfully.
Fix this by calling exit 1 when the test fails.
Signed-off-by: Po-Hsu Lin <po-hsu.lin(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/net/run_netsocktests | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/run_netsocktests b/tools/testing/selftests/net/run_netsocktests
index 16058bbea7a8..c195b4478662 100755
--- a/tools/testing/selftests/net/run_netsocktests
+++ b/tools/testing/selftests/net/run_netsocktests
@@ -6,7 +6,7 @@ echo "--------------------"
./socket
if [ $? -ne 0 ]; then
echo "[FAIL]"
+ exit 1
else
echo "[PASS]"
fi
-
--
2.20.1