On Thu, Mar 07, 2019 at 01:59:12PM +0900, Masahiro Yamada wrote: [snip]
[2]
The shell script keeps running even when an error occurs.
If any line in a shell script fails, something has probably already gone wrong.
I highly recommend adding 'set -e' at the very beginning of your shell script.
It will propagate the error to Make.
Ok, I will consider making it -e. But please note that the script itself does not have any bugs.
Heh. I do not know why you are so confident in your code, though.
OK, let's say this script has no bug.
But the script may fail for a different reason. For example, what if cpio is not installed on the user's machine?
In such a situation, this script will produce empty kernel/kheaders_data.tar.xz, but the build system will continue running, and succeed.
Don't you think it would be better to let the build fail immediately than to let the user debug a wrongly created image?
Yes, I do think it is better and I am doing that in the next revision already. It is appended below.
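To make the failure mode above concrete, here is a minimal sketch that runs the same two-step snippet with and without -e. It is only an illustration: "no_such_archiver" is a hypothetical stand-in for a missing cpio, and none of the variable names come from the patch.

```bash
#!/bin/bash
# Sketch only: "no_such_archiver" is a hypothetical stand-in for a missing cpio.
script='
printf "include/a.h\n" | no_such_archiver   # fails: command not found
echo "archive step reached"
'

# Without -e the failed pipeline is ignored and the snippet exits 0.
out_without=$(bash -c "$script" 2>/dev/null) && status_without=0 || status_without=$?

# With -e the shell stops at the failed pipeline and returns a nonzero status.
out_with=$(bash -e -c "$script" 2>/dev/null) && status_with=0 || status_with=$?

echo "without -e: status=$status_without output='$out_without'"
echo "with -e:    status=$status_with output='$out_with'"
```

Only in the second run does the caller (Make, in the case discussed here) get a nonzero status to act on.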
[5] strip-comments.pl is short enough, and I do not assume any other user.
IMHO, it would be cleaner to embed the Perl one-liner into the shell script, for example, as follows:
find $cpio_dir -type f -print0 | xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
This does not work. I already tried it. But I guess I could generate the script on the fly.
I tested my code, and it worked for me.
perl -e <command_line>
is useful to run short code directly.
Sorry, previous attempts to use perl -pie didn't work, but I tried your one-liner and it does work. Below is the updated patch with this and all the other nits addressed, including the module.lds one.
About module.lds for the ia64 architecture, sorry for missing that - I did not test all architectures, just arm64 and x86, which I have hardware for. Interestingly, the kbuild robot didn't catch it either. Anyway, it is addressed below.
Thanks a lot for all the great review.
---8<-----------------------
From 22cfd523f14c3b7e13fa05fc1c483a4d368e00d1 Mon Sep 17 00:00:00 2001
From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Date: Fri, 18 Jan 2019 15:52:41 -0500
Subject: [PATCH v4.1] Provide in-kernel headers for making it easy to extend the kernel

Introduce in-kernel headers and other artifacts which are made available as an archive through proc (/proc/kheaders.tar.xz file). This archive makes it possible to build kernel modules, run eBPF programs, and other tracing programs that need to extend the kernel for tracing purposes without any dependency on the file system having headers and build artifacts.
On Android and embedded systems, it is common to switch kernels but not have kernel headers available on the file system. Raw kernel headers also cannot be copied into the filesystem like they can be on other distros, due to licensing and other issues. There's no linux-headers package on Android. Further once a different kernel is booted, any headers stored on the file system will no longer be useful. By storing the headers as a compressed archive within the kernel, we can avoid these issues that have been a hindrance for a long time.
The feature is also buildable as a module in case the user does not want it to be part of the kernel image. This makes it possible to load and unload the headers on demand. A tracing program or a kernel module builder can load the module, do its operations, and then unload the module to save kernel memory. The total memory needed is 3.8MB.
The code to read the headers is based on /proc/config.gz code and uses the same technique to embed the headers.
To build a module, the below steps have been tested on an x86 machine:
modprobe kheaders
rm -rf $HOME/headers
mkdir -p $HOME/headers
tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
cd my-kernel-module
make -C $HOME/headers M=$(pwd) modules
rmmod kheaders

Additional notes: (1) external modules must be built on the same arch as the host that built vmlinux. This can be done either in a qemu emulated chroot on the target, or natively. This is due to host arch dependency of kernel scripts.
(2) A limitation of module building with this is, since Module.symvers is not available in the archive due to a cyclic dependency with building of the archive into the kernel or module binaries, the modules built using the archive will not contain symbol versioning (modversion). This is usually not an issue since the idea of this patch is to build a kernel module on the fly and load it into the same kernel. An appropriate warning is already printed by the kernel to alert the user of modules not having modversions when built using the archive. For building with modversions, the user can use traditional header packages. For our tracing usecases, we build modules on the fly with this so it is not a concern.
(3) I have left IKHD_ST and IKHD_ED markers as is to facilitate future patches that would extract the headers from a kernel or module image.
Tested-by: qais.yousef@arm.com
Tested-by: dietmar.eggemann@arm.com
Tested-by: linux@manojrajarao.com
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Changes since v4:
- added module.lds if ia64 otherwise ia64 may fail to build.
- added clean-files rule to Makefile
- removed strip-comments script and doing it inline
- added set -e to header generator to die on errors
- fixed a minor issue where find command was noisy.
- added Tested-by tags from ARM folks.
- TODO: several more Masahiro comments.

Changes since v3:
- Blank tar was being generated because of a one line I forgot to
  push. It is updated now.
- Added module.lds since arm64 needs it to build modules.

Changes since v2:
(Thanks to Masahiro Yamada for several excellent suggestions)
- Added support for out of tree builds.
- Added incremental build support bringing down build time of
  incremental builds from 50 seconds to 5 seconds.
- Fixed various small nits / cleanups.
- clean ups to kheaders.c pointed by Alexey Dobriyan.
- Fixed MODULE_LICENSE in test module and kheaders.c
- Dropped Module.symvers from archive due to circular dependency.

Changes since v1:
- removed IKH_EXTRA variable, not needed (Masahiro Yamada)
- small fix ups to selftest
- added target to main Makefile etc
- added MODULE_LICENSE to test module
- made selftest more quiet

Changes since RFC:
Both changes bring size down to 3.8MB:
- use xz for compression
- strip comments except SPDX lines
- Call out the module name in Kconfig
- Also added selftests in second patch to ensure headers are always
  working.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 Documentation/dontdiff  |  1 +
 init/Kconfig            | 11 ++++++
 kernel/.gitignore       |  3 ++
 kernel/Makefile         | 36 ++++++++++++++++++
 kernel/kheaders.c       | 72 ++++++++++++++++++++++++++++++++++++
 scripts/gen_ikh_data.sh | 81 +++++++++++++++++++++++++++++++++++++++++
 6 files changed, 204 insertions(+)
 create mode 100644 kernel/kheaders.c
 create mode 100755 scripts/gen_ikh_data.sh

diff --git a/Documentation/dontdiff b/Documentation/dontdiff
index 2228fcc8e29f..05a2319ee2a2 100644
--- a/Documentation/dontdiff
+++ b/Documentation/dontdiff
@@ -151,6 +151,7 @@ int8.c
 kallsyms
 kconfig
 keywords.c
+kheaders_data.h*
 ksym.c*
 ksym.h*
 kxgettext
diff --git a/init/Kconfig b/init/Kconfig
index c9386a365eea..63ff0990ae55 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -563,6 +563,17 @@ config IKCONFIG_PROC
 	  This option enables access to the kernel configuration file
 	  through /proc/config.gz.

+config IKHEADERS_PROC
+	tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
+	select BUILD_BIN2C
+	depends on PROC_FS
+	help
+	  This option enables access to the kernel header and other artifacts that
+	  are generated during the build process. These can be used to build kernel
+	  modules, and other in-kernel programs such as those generated by eBPF
+	  and systemtap tools. If you build the headers as a module, a module
+	  called kheaders.ko is built which can be loaded to get access to them.
+
 config LOG_BUF_SHIFT
 	int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
 	range 12 25
diff --git a/kernel/.gitignore b/kernel/.gitignore
index b3097bde4e9c..484018945e93 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -3,5 +3,8 @@
 #
 config_data.h
 config_data.gz
+kheaders.md5
+kheaders_data.h
+kheaders_data.tar.xz
 timeconst.h
 hz.bc
diff --git a/kernel/Makefile b/kernel/Makefile
index 6aa7543bcdb2..2c7f5e9a2e9f 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_USER_NS) += user_namespace.o
 obj-$(CONFIG_PID_NS) += pid_namespace.o
 obj-$(CONFIG_IKCONFIG) += configs.o
+obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
 obj-$(CONFIG_SMP) += stop_machine.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
 obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
@@ -130,3 +131,38 @@ filechk_ikconfiggz = \
 targets += config_data.h
 $(obj)/config_data.h: $(obj)/config_data.gz FORCE
 	$(call filechk,ikconfiggz)
+
+# Build a list of in-kernel headers for building kernel modules
+ikh_file_list := include/
+ikh_file_list += arch/$(SRCARCH)/Makefile
+ikh_file_list += arch/$(SRCARCH)/include/
+ikh_file_list += arch/$(SRCARCH)/module.lds
+ikh_file_list += arch/$(SRCARCH)/kernel/module.lds
+ikh_file_list += scripts/
+ikh_file_list += Makefile
+
+# Things we need from the $objtree. "OBJDIR" is for the gen_ikh_data.sh
+# script to identify that this comes from the $objtree directory
+ikh_file_list += OBJDIR/scripts/
+ikh_file_list += OBJDIR/include/
+ikh_file_list += OBJDIR/arch/$(SRCARCH)/include/
+ifeq ($(CONFIG_STACK_VALIDATION), y)
+ikh_file_list += OBJDIR/tools/objtool/objtool
+endif
+
+$(obj)/kheaders.o: $(obj)/kheaders_data.h
+
+quiet_cmd_genikh = GEN     $(obj)/kheaders_data.tar.xz
+cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $(ikh_file_list)
+$(obj)/kheaders_data.tar.xz: FORCE
+	$(call cmd,genikh)
+
+filechk_ikheadersxz = \
+	echo "static const char kernel_headers_data[] __used = KH_MAGIC_START"; \
+	cat $< | scripts/bin2c; \
+	echo "KH_MAGIC_END;"
+
+$(obj)/kheaders_data.h: $(obj)/kheaders_data.tar.xz FORCE
+	$(call filechk,ikheadersxz)
+
+clean-files := kheaders_data.tar.xz kheaders_data.h kheaders.md5
diff --git a/kernel/kheaders.c b/kernel/kheaders.c
new file mode 100644
index 000000000000..46a6358301e5
--- /dev/null
+++ b/kernel/kheaders.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel/kheaders.c
+ * Provide headers and artifacts needed to build kernel modules.
+ * (Borrowed code from kernel/configs.c)
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/init.h>
+#include <linux/uaccess.h>
+
+/*
+ * Define kernel_headers_data and kernel_headers_data_size, which contains the
+ * compressed kernel headers.  The file is first compressed with xz and then
+ * bounded by two eight byte magic numbers to allow extraction from a binary
+ * kernel image:
+ *
+ *   IKHD_ST
+ *   <image>
+ *   IKHD_ED
+ */
+#define KH_MAGIC_START	"IKHD_ST"
+#define KH_MAGIC_END	"IKHD_ED"
+#include "kheaders_data.h"
+
+
+#define KH_MAGIC_SIZE (sizeof(KH_MAGIC_START) - 1)
+#define kernel_headers_data_size \
+	(sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
+
+static ssize_t
+ikheaders_read_current(struct file *file, char __user *buf,
+		      size_t len, loff_t *offset)
+{
+	return simple_read_from_buffer(buf, len, offset,
+				       kernel_headers_data + KH_MAGIC_SIZE,
+				       kernel_headers_data_size);
+}
+
+static const struct file_operations ikheaders_file_ops = {
+	.read = ikheaders_read_current,
+	.llseek = default_llseek,
+};
+
+static int __init ikheaders_init(void)
+{
+	struct proc_dir_entry *entry;
+
+	/* create the current headers file */
+	entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
+			    &ikheaders_file_ops);
+	if (!entry)
+		return -ENOMEM;
+
+	proc_set_size(entry, kernel_headers_data_size);
+
+	return 0;
+}
+
+static void __exit ikheaders_cleanup(void)
+{
+	remove_proc_entry("kheaders.tar.xz", NULL);
+}
+
+module_init(ikheaders_init);
+module_exit(ikheaders_cleanup);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Joel Fernandes");
+MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
new file mode 100755
index 000000000000..bcd694e9a105
--- /dev/null
+++ b/scripts/gen_ikh_data.sh
@@ -0,0 +1,81 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+set -e
+
+spath="$(dirname "$(readlink -f "$0")")"
+kroot="$spath/.."
+outdir="$(pwd)"
+tarfile=$1
+cpio_dir=$outdir/$tarfile.tmp
+
+file_list=${@:2}
+
+src_file_list=""
+for f in $file_list; do
+	if [ ! -f "$kroot/$f" ] && [ ! -d "$kroot/$f" ]; then continue; fi
+	src_file_list="$src_file_list $(echo $f | grep -v OBJDIR)"
+done
+
+obj_file_list=""
+for f in $file_list; do
+	f=$(echo $f | grep OBJDIR | sed -e 's/OBJDIR\///g')
+	if [ ! -f $f ] && [ ! -d $f ]; then continue; fi
+	obj_file_list="$obj_file_list $f";
+done
+
+# Support incremental builds by skipping archive generation
+# if timestamps of files being archived are not changed.
+
+# This block is useful for debugging the incremental builds.
+# Uncomment it for debugging.
+# iter=1
+# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
+# else; 	iter=$(($(cat /tmp/iter) + 1)); fi
+# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
+# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
+
+# modules.order and include/generated/compile.h are ignored because these are
+# touched even when none of the source files changed. This causes pointless
+# regeneration, so let us ignore them for md5 calculation.
+pushd $kroot > /dev/null
+src_files_md5="$(find $src_file_list -type f ! -name modules.order |
+		grep -v "include/generated/compile.h" |
+		xargs ls -lR | md5sum | cut -d ' ' -f1)"
+popd > /dev/null
+obj_files_md5="$(find $obj_file_list -type f ! -name modules.order |
+		grep -v "include/generated/compile.h" |
+		xargs ls -lR | md5sum | cut -d ' ' -f1)"
+
+if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
+if [ -f kernel/kheaders.md5 ] &&
+	[ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
+	[ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
+	[ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
+		exit
+fi
+
+rm -rf $cpio_dir
+mkdir $cpio_dir
+
+pushd $kroot > /dev/null
+for f in $src_file_list;
+	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir
+popd > /dev/null
+
+# The second CPIO can complain if files already exist which can
+# happen with out of tree builds. Just silence CPIO for now.
+for f in $obj_file_list;
+	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
+done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
+
+find $cpio_dir -type f -print0 |
+	xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'
+
+tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
+
+echo "$src_files_md5" > kernel/kheaders.md5
+echo "$obj_files_md5" >> kernel/kheaders.md5
+echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
+
+rm -rf $cpio_dir
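
The commit message says the IKHD_ST and IKHD_ED markers are kept so that a later patch could extract the headers from a kernel or module image. As a rough illustration of how such an extraction might look, here is a sketch; it is not part of the patch. The function name extract_kheaders is hypothetical, and the sketch assumes GNU grep (-a, -b, -o), an uncompressed image, and that neither marker occurs by chance inside the payload.

```bash
#!/bin/bash
# Hypothetical helper, not part of the patch: copy the bytes found between
# the IKHD_ST and IKHD_ED markers of an image into an output file.
extract_kheaders() {
    local image=$1 out=$2 start end first
    # Byte offset of the first occurrence of each marker (GNU grep assumed).
    start=$(grep -abo 'IKHD_ST' "$image" | head -n1 | cut -d: -f1)
    end=$(grep -abo 'IKHD_ED' "$image" | head -n1 | cut -d: -f1)
    if [ -z "$start" ] || [ -z "$end" ]; then
        echo "markers not found in $image" >&2
        return 1
    fi
    # The payload begins right after the 7-byte start marker.
    first=$((start + 7))
    if [ "$end" -lt "$first" ]; then
        echo "end marker precedes payload in $image" >&2
        return 1
    fi
    # tail -c +N is 1-based, hence the extra +1.
    tail -c +$((first + 1)) "$image" | head -c $((end - first)) > "$out"
}
```

Under those assumptions, something like "extract_kheaders kernel/kheaders.ko kheaders.tar.xz" followed by "tar -tJf kheaders.tar.xz" would be one way to check the result.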