Hello all,
This patch series targets a long-standing BPF usability issue - the lack of general cross-compilation support - by enabling cross-endian usage of libbpf and bpftool, as well as supporting cross-endian build targets for selftests/bpf.
Benefits include improved BPF development and testing for embedded systems based on e.g. big-endian MIPS, more build options e.g. for s390x systems, and better accessibility to the very latest test tools e.g. 'test_progs'.
Initial development and testing used mips64, since this arch makes switching the build byte-order trivial and is thus very handy for A/B testing. However, it lacks some key features (bpf2bpf calls, kfuncs, etc.), making for poor selftests/bpf coverage.
Final testing takes the kernel and selftests/bpf cross-built from x86_64 to s390x, and runs the result under QEMU/s390x. That same configuration could also be used on kernel-patches/bpf CI for regression testing endian support or perhaps load-sharing s390x builds across x86_64 systems.
This thread includes some background regarding testing on QEMU/s390x and the generally favourable results: https://lore.kernel.org/bpf/ZsEcsaa3juxxQBUf@kodidev-ubuntu/
Feedback and suggestions are welcome!
Best regards, Tony
Changelog:
---------
v3 -> v4:
 - fix a use-after-free ELF data-handling error causing rare CI failures
 - move bswap functions for func/line/core-relo records to internal header
 - use bswap functions also for info blobs in light skeleton
v2 -> v3: (feedback from Andrii)
 - improve some log and commit message formatting
 - restructure BTF.ext endianness safety checks and byte-swapping
 - use BTF.ext info record definitions for swapping, require BTF v1
 - follow BTF API implementation more closely for BTF.ext
 - explicitly reject loading non-native endianness program into kernel
 - simplify linker output byte-order setting
 - drop redundant safety checks during linking
 - simplify endianness macro and improve blob setup code for light skel
 - no unexpected test failures after cross-compiling x86_64 -> s390x
v1 -> v2:
 - fixed a light skeleton bug causing test_progs 'map_ptr' failure
 - simplified some BTF.ext related endianness logic
 - remove an 'inline' usage related to CI checkpatch failure
 - improve some formatting noted by checkpatch warnings
 - unexpected 'test_progs' failures drop 3 -> 2 (x86_64 to s390x cross)
Tony Ambardar (8):
  libbpf: Improve log message formatting
  libbpf: Fix header comment typos for BTF.ext
  libbpf: Fix output .symtab byte-order during linking
  libbpf: Support BTF.ext loading and output in either endianness
  libbpf: Support opening bpf objects of either endianness
  libbpf: Support linking bpf objects of either endianness
  libbpf: Support creating light skeleton of either endianness
  selftests/bpf: Support cross-endian building
 tools/lib/bpf/bpf_gen_internal.h     |   1 +
 tools/lib/bpf/btf.c                  | 196 ++++++++++++++++++++++++---
 tools/lib/bpf/btf.h                  |   3 +
 tools/lib/bpf/btf_dump.c             |   2 +-
 tools/lib/bpf/btf_relocate.c         |   2 +-
 tools/lib/bpf/gen_loader.c           | 187 +++++++++++++++++++------
 tools/lib/bpf/libbpf.c               |  54 ++++++--
 tools/lib/bpf/libbpf.map             |   2 +
 tools/lib/bpf/libbpf_internal.h      |  48 ++++++-
 tools/lib/bpf/linker.c               |  92 ++++++++++---
 tools/lib/bpf/relo_core.c            |   2 +-
 tools/lib/bpf/skel_internal.h        |   3 +-
 tools/testing/selftests/bpf/Makefile |   7 +-
 13 files changed, 502 insertions(+), 97 deletions(-)
Fix missing newlines and extraneous terminal spaces in messages.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/btf.c          | 6 +++---
 tools/lib/bpf/btf_dump.c     | 2 +-
 tools/lib/bpf/btf_relocate.c | 2 +-
 tools/lib/bpf/libbpf.c       | 4 ++--
 tools/lib/bpf/relo_core.c    | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 32c00db3b91b..f5081de86ee0 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -2940,7 +2940,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
 	/* If no records, return failure now so .BTF.ext won't be used. */
 	if (!info_left) {
-		pr_debug("%s section in .BTF.ext has no records", ext_sec->desc);
+		pr_debug("%s section in .BTF.ext has no records\n", ext_sec->desc);
 		return -EINVAL;
 	}
@@ -3028,7 +3028,7 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)

 	if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
 	    data_size < hdr->hdr_len) {
-		pr_debug("BTF.ext header not found");
+		pr_debug("BTF.ext header not found\n");
 		return -EINVAL;
 	}
@@ -3290,7 +3290,7 @@ int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts)

 	d = btf_dedup_new(btf, opts);
 	if (IS_ERR(d)) {
-		pr_debug("btf_dedup_new failed: %ld", PTR_ERR(d));
+		pr_debug("btf_dedup_new failed: %ld\n", PTR_ERR(d));
 		return libbpf_err(-EINVAL);
 	}
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
index 894860111ddb..18cbcf342f2b 100644
--- a/tools/lib/bpf/btf_dump.c
+++ b/tools/lib/bpf/btf_dump.c
@@ -1304,7 +1304,7 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id,
 			 * chain, restore stack, emit warning, and try to
 			 * proceed nevertheless
 			 */
-			pr_warn("not enough memory for decl stack:%d", err);
+			pr_warn("not enough memory for decl stack: %d\n", err);
 			d->decl_stack_cnt = stack_start;
 			return;
 		}
diff --git a/tools/lib/bpf/btf_relocate.c b/tools/lib/bpf/btf_relocate.c
index 4f7399d85eab..b72f83e15156 100644
--- a/tools/lib/bpf/btf_relocate.c
+++ b/tools/lib/bpf/btf_relocate.c
@@ -428,7 +428,7 @@ static int btf_relocate_rewrite_strs(struct btf_relocate *r, __u32 i)
 		} else {
 			off = r->str_map[*str_off];
 			if (!off) {
-				pr_warn("string '%s' [offset %u] is not mapped to base BTF",
+				pr_warn("string '%s' [offset %u] is not mapped to base BTF\n",
 					btf__str_by_offset(r->btf, off), *str_off);
 				return -ENOENT;
 			}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index d3a542649e6b..0226d3b50709 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -12755,7 +12755,7 @@ struct bpf_link *bpf_program__attach_freplace(const struct bpf_program *prog,
 	}
 	if (prog->type != BPF_PROG_TYPE_EXT) {
-		pr_warn("prog '%s': only BPF_PROG_TYPE_EXT can attach as freplace",
+		pr_warn("prog '%s': only BPF_PROG_TYPE_EXT can attach as freplace\n",
 			prog->name);
 		return libbpf_err_ptr(-EINVAL);
 	}
@@ -13829,7 +13829,7 @@ int bpf_object__open_subskeleton(struct bpf_object_subskeleton *s)
 		map_type = btf__type_by_id(btf, map_type_id);

 		if (!btf_is_datasec(map_type)) {
-			pr_warn("type for map '%1$s' is not a datasec: %2$s",
+			pr_warn("type for map '%1$s' is not a datasec: %2$s\n",
 				bpf_map__name(map), __btf_kind_str(btf_kind(map_type)));
 			return libbpf_err(-EINVAL);
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
index 63a4d5ad12d1..7632e9d41827 100644
--- a/tools/lib/bpf/relo_core.c
+++ b/tools/lib/bpf/relo_core.c
@@ -1339,7 +1339,7 @@ int bpf_core_calc_relo_insn(const char *prog_name,
 						  cands->cands[i].id, cand_spec);
 		if (err < 0) {
 			bpf_core_format_spec(spec_buf, sizeof(spec_buf), cand_spec);
-			pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n ",
+			pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n",
 				prog_name, relo_idx, i, spec_buf, err);
 			return err;
 		}
Mention struct btf_ext_info_sec rather than non-existent btf_sec_func_info in BTF.ext struct documentation.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/libbpf_internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 408df59e0771..8cda511a1982 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -448,11 +448,11 @@ struct btf_ext_info {
  *
  * The func_info subsection layout:
  *   record size for struct bpf_func_info in the func_info subsection
- *   struct btf_sec_func_info for section #1
+ *   struct btf_ext_info_sec for section #1
  *   a list of bpf_func_info records for section #1
  *   where struct bpf_func_info mimics one in include/uapi/linux/bpf.h
  *   but may not be identical
- *   struct btf_sec_func_info for section #2
+ *   struct btf_ext_info_sec for section #2
  *   a list of bpf_func_info records for section #2
  *   ......
  *
Object linking output data uses the default ELF_T_BYTE type for '.symtab' section data, which disables any libelf-based translation. Explicitly set the ELF_T_SYM type for output to restore libelf's byte-order conversion, noting that input '.symtab' data is already correctly translated.
Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/linker.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index 9cd3d4109788..7489306cd6f7 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -396,6 +396,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
 		pr_warn_elf("failed to create SYMTAB data");
 		return -EINVAL;
 	}
+	/* Ensure libelf translates byte-order of symbol records */
+	sec->data->d_type = ELF_T_SYM;
 	str_off = strset__add_str(linker->strtab_strs, sec->sec_name);
 	if (str_off < 0)
On Fri, 2024-08-30 at 00:29 -0700, Tony Ambardar wrote:
Object linking output data uses the default ELF_T_BYTE type for '.symtab' section data, which disables any libelf-based translation. Explicitly set the ELF_T_SYM type for output to restore libelf's byte-order conversion, noting that input '.symtab' data is already correctly translated.
Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
 tools/lib/bpf/linker.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index 9cd3d4109788..7489306cd6f7 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -396,6 +396,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
 		pr_warn_elf("failed to create SYMTAB data");
 		return -EINVAL;
 	}
+	/* Ensure libelf translates byte-order of symbol records */
+	sec->data->d_type = ELF_T_SYM;
I tried grepping through libelf to find out how this affects things, and identified that it is primarily used by elfutils/libelf/gelf_xlatetof.c:gelf_xlatetof(), which is an interface function and we don't seem to use it. It is also used by dwfl_* functions while applying relocations, but we don't use that either.
Could you please elaborate a bit on effects of this change?
 	str_off = strset__add_str(linker->strtab_strs, sec->sec_name);
 	if (str_off < 0)
On Fri, Aug 30, 2024 at 03:15:10PM -0700, Eduard Zingerman wrote:
On Fri, 2024-08-30 at 00:29 -0700, Tony Ambardar wrote:
Object linking output data uses the default ELF_T_BYTE type for '.symtab' section data, which disables any libelf-based translation. Explicitly set the ELF_T_SYM type for output to restore libelf's byte-order conversion, noting that input '.symtab' data is already correctly translated.
Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
 tools/lib/bpf/linker.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index 9cd3d4109788..7489306cd6f7 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -396,6 +396,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
 		pr_warn_elf("failed to create SYMTAB data");
 		return -EINVAL;
 	}
+	/* Ensure libelf translates byte-order of symbol records */
+	sec->data->d_type = ELF_T_SYM;
I tried grepping through libelf to find out how this affects things, and identified that it is primarily used by elfutils/libelf/gelf_xlatetof.c:gelf_xlatetof(), which is an interface function and we don't seem to use it. It is also used by dwfl_* functions while applying relocations, but we don't use that either.
Right, gelf_xlatetof() is exposed for _explicit_ user conversions, but libelf still does translations implicitly for known section record types, based on the ELF file's byte-order metadata. The idea is that ELF data loaded in memory will be native-endianness for accessibility, but output in the original endianness at rest/in a file, all transparently.
We try to follow the same idea in libbpf when opening and writing .BTF and .BTF.ext data (e.g. see the *_raw_data() funcs).
Could you please elaborate a bit on effects of this change?
When linking objects of either endianness, libelf can translate the input files based on ELF headers (endianness and type ELF_T_SYM) and allows us to process .symtab data. When writing out the linked file however, we create a new .symtab section in init_output_elf() but leave it as the default ELF_T_BYTE type, which undergoes no translation and leaves .symtab always in native byte-order regardless of target endianness.
See also 61e8aeda9398 ("bpf: Fix libelf endian handling in resolv_btfids") and related links for a similar example and explanations. Hope that helps.
Cheers, Tony
 	str_off = strset__add_str(linker->strtab_strs, sec->sec_name);
 	if (str_off < 0)
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for querying and setting the byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/btf.c             | 192 +++++++++++++++++++++++++++++---
 tools/lib/bpf/btf.h             |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  33 ++++++
 4 files changed, 214 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index f5081de86ee0..064cfe126c09 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3022,25 +3022,102 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
 	return btf_ext_setup_info(btf_ext, &param);
 }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
+/* Swap byte-order of BTF.ext header with any endianness */
+static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len)
 {
-	const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
+	struct btf_ext_header *h = btf_ext->hdr;
-	if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
-	    data_size < hdr->hdr_len) {
-		pr_debug("BTF.ext header not found\n");
+	h->magic = bswap_16(h->magic);
+	h->hdr_len = bswap_32(h->hdr_len);
+	h->func_info_off = bswap_32(h->func_info_off);
+	h->func_info_len = bswap_32(h->func_info_len);
+	h->line_info_off = bswap_32(h->line_info_off);
+	h->line_info_len = bswap_32(h->line_info_len);
+
+	if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
+		return;
+
+	h->core_relo_off = bswap_32(h->core_relo_off);
+	h->core_relo_len = bswap_32(h->core_relo_len);
+}
+
+/* Swap byte-order of a generic info subsection */
+static void info_subsec_bswap(const struct btf_ext_header *hdr, bool native,
+			      __u32 off, __u32 len, anon_info_bswap_fn_t bswap)
+{
+	__u32 left, i, *rs, rec_size, num_info;
+	struct btf_ext_info_sec *si;
+	void *p;
+
+	if (len == 0)
+		return;
+
+	rs = (void *)hdr + hdr->hdr_len + off;	/* record size */
+	si = (void *)rs + sizeof(__u32);	/* sec info #1 */
+	rec_size = native ? *rs : bswap_32(*rs);
+	*rs = bswap_32(*rs);
+
+	left = len - sizeof(__u32);
+	while (left > 0) {
+		num_info = native ? si->num_info : bswap_32(si->num_info);
+		si->sec_name_off = bswap_32(si->sec_name_off);
+		si->num_info = bswap_32(si->num_info);
+		left -= offsetof(struct btf_ext_info_sec, data);
+		p = si->data;
+		for (i = 0; i < num_info; i++)	/* list of records */
+			p += bswap(p);
+		si = p;
+		left -= rec_size * num_info;
+	}
+}
+
+/*
+ * Swap endianness of the whole info segment in a BTF.ext data section:
+ *   - requires BTF.ext header data in native byte order
+ *   - only support info structs from BTF version 1
+ *   - native: current info data is native endianness
+ */
+static void btf_ext_bswap_info(struct btf_ext *btf_ext, bool native)
+{
+	const struct btf_ext_header *hdr = btf_ext->hdr;
+
+	/* Swap func_info subsection byte-order */
+	info_subsec_bswap(hdr, native, hdr->func_info_off, hdr->func_info_len,
+			  (anon_info_bswap_fn_t)bpf_func_info_bswap);
+
+	/* Swap line_info subsection byte-order */
+	info_subsec_bswap(hdr, native, hdr->line_info_off, hdr->line_info_len,
+			  (anon_info_bswap_fn_t)bpf_line_info_bswap);
+
+	/* Swap core_relo subsection byte-order (if present) */
+	if (hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
+		return;
+
+	info_subsec_bswap(hdr, native, hdr->core_relo_off, hdr->core_relo_len,
+			  (anon_info_bswap_fn_t)bpf_core_relo_bswap);
+}
+
+/* Validate hdr data & info sections, convert to native endianness */
+static int btf_ext_parse(struct btf_ext *btf_ext)
+{
+	struct btf_ext_header *hdr = btf_ext->hdr;
+	__u32 hdr_len, info_size, data_size = btf_ext->data_size;
+
+	if (data_size < offsetofend(struct btf_ext_header, hdr_len)) {
+		pr_debug("BTF.ext header too short\n");
 		return -EINVAL;
 	}
+	hdr_len = hdr->hdr_len;
 	if (hdr->magic == bswap_16(BTF_MAGIC)) {
-		pr_warn("BTF.ext in non-native endianness is not supported\n");
-		return -ENOTSUP;
+		btf_ext->swapped_endian = true;
+		hdr_len = bswap_32(hdr_len);
 	} else if (hdr->magic != BTF_MAGIC) {
 		pr_debug("Invalid BTF.ext magic:%x\n", hdr->magic);
 		return -EINVAL;
 	}
-	if (hdr->version != BTF_VERSION) {
+	/* Ensure known version of structs, current BTF_VERSION == 1 */
+	if (hdr->version != 1) {
 		pr_debug("Unsupported BTF.ext version:%u\n", hdr->version);
 		return -ENOTSUP;
 	}
@@ -3050,11 +3127,42 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
 		return -ENOTSUP;
 	}
-	if (data_size == hdr->hdr_len) {
+	if (data_size < hdr_len) {
+		pr_debug("BTF.ext header not found\n");
+		return -EINVAL;
+	} else if (data_size == hdr_len) {
 		pr_debug("BTF.ext has no data\n");
 		return -EINVAL;
 	}
+	/* Verify mandatory hdr info details present */
+	if (hdr_len < offsetofend(struct btf_ext_header, line_info_len)) {
+		pr_warn("BTF.ext header missing func_info, line_info\n");
+		return -EINVAL;
+	}
+
+	/* Keep hdr native byte-order in memory for introspection */
+	if (btf_ext->swapped_endian)
+		btf_ext_bswap_hdr(btf_ext, hdr_len);
+
+	/* Basic info section consistency checks */
+	info_size = btf_ext->data_size - hdr_len;
+	if (info_size & 0x03) {
+		pr_warn("BTF.ext info size not 4-byte multiple\n");
+		return -EINVAL;
+	}
+	info_size -= hdr->func_info_len + hdr->line_info_len;
+	if (hdr_len >= offsetofend(struct btf_ext_header, core_relo_len))
+		info_size -= hdr->core_relo_len;
+	if (info_size) {
+		pr_warn("BTF.ext info size mismatch with header data\n");
+		return -EINVAL;
+	}
+
+	/* Keep infos native byte-order in memory for introspection */
+	if (btf_ext->swapped_endian)
+		btf_ext_bswap_info(btf_ext, !btf_ext->swapped_endian);
+
 	return 0;
 }
@@ -3066,6 +3174,7 @@ void btf_ext__free(struct btf_ext *btf_ext)
 	free(btf_ext->line_info.sec_idxs);
 	free(btf_ext->core_relo_info.sec_idxs);
 	free(btf_ext->data);
+	free(btf_ext->data_swapped);
 	free(btf_ext);
 }
@@ -3086,15 +3195,10 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size)
 	}
 	memcpy(btf_ext->data, data, size);

-	err = btf_ext_parse_hdr(btf_ext->data, size);
+	err = btf_ext_parse(btf_ext);
 	if (err)
 		goto done;

-	if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, line_info_len)) {
-		err = -EINVAL;
-		goto done;
-	}
-
 	err = btf_ext_setup_func_info(btf_ext);
 	if (err)
 		goto done;
@@ -3119,15 +3223,71 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size)
 	return btf_ext;
 }
+static void *btf_ext_raw_data(const struct btf_ext *btf_ext_ro, __u32 *size,
+			      bool swap_endian)
+{
+	struct btf_ext *btf_ext = (struct btf_ext *)btf_ext_ro;
+	const __u32 data_sz = btf_ext->data_size;
+	void *data;
+
+	data = swap_endian ? btf_ext->data_swapped : btf_ext->data;
+	if (data) {
+		*size = data_sz;
+		return data;
+	}
+
+	data = calloc(1, data_sz);
+	if (!data)
+		return NULL;
+	memcpy(data, btf_ext->data, data_sz);
+
+	if (swap_endian) {
+		btf_ext_bswap_info(btf_ext, true);
+		btf_ext_bswap_hdr(btf_ext, btf_ext->hdr->hdr_len);
+		btf_ext->data_swapped = data;
+	}
+
+	*size = data_sz;
+	return data;
+}
+
 const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size)
 {
+	__u32 data_sz;
+	void *data;
+
+	data = btf_ext_raw_data(btf_ext, &data_sz, btf_ext->swapped_endian);
+	if (!data)
+		return errno = ENOMEM, NULL;
+
 	*size = btf_ext->data_size;
-	return btf_ext->data;
+	return data;
 }

 __attribute__((alias("btf_ext__raw_data")))
 const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size);

+enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext)
+{
+	if (is_host_big_endian())
+		return btf_ext->swapped_endian ? BTF_LITTLE_ENDIAN : BTF_BIG_ENDIAN;
+	else
+		return btf_ext->swapped_endian ? BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
+}
+
+int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
+{
+	if (endian != BTF_LITTLE_ENDIAN && endian != BTF_BIG_ENDIAN)
+		return libbpf_err(-EINVAL);
+
+	btf_ext->swapped_endian = is_host_big_endian() != (endian == BTF_BIG_ENDIAN);
+
+	if (!btf_ext->swapped_endian) {
+		free(btf_ext->data_swapped);
+		btf_ext->data_swapped = NULL;
+	}
+	return 0;
+}
+
struct btf_dedup;
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index b68d216837a9..e3cf91687c78 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -167,6 +167,9 @@ LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);
 LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size);
 LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
 LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size);
+LIBBPF_API enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext);
+LIBBPF_API int btf_ext__set_endianness(struct btf_ext *btf_ext,
+				       enum btf_endianness endian);

 LIBBPF_API int btf__find_str(struct btf *btf, const char *s);
 LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 8f0d9ea3b1b4..5c17632807b6 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -421,6 +421,8 @@ LIBBPF_1.5.0 {
 	global:
 		btf__distill_base;
 		btf__relocate;
+		btf_ext__endianness;
+		btf_ext__set_endianness;
 		bpf_map__autoattach;
 		bpf_map__set_autoattach;
 		bpf_program__attach_sockmap;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 8cda511a1982..81d375015c2b 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -484,6 +484,8 @@ struct btf_ext {
 		struct btf_ext_header *hdr;
 		void *data;
 	};
+	void *data_swapped;
+	bool swapped_endian;
 	struct btf_ext_info func_info;
 	struct btf_ext_info line_info;
 	struct btf_ext_info core_relo_info;
@@ -511,6 +513,37 @@ struct bpf_line_info_min {
 	__u32 line_col;
 };
+/* Functions/typedef to help byte-swap info records, returning their size */
+
+typedef int (*anon_info_bswap_fn_t)(void *);
+
+static inline int bpf_func_info_bswap(struct bpf_func_info *i)
+{
+	i->insn_off = bswap_32(i->insn_off);
+	i->type_id = bswap_32(i->type_id);
+	return sizeof(*i);
+}
+
+static inline int bpf_line_info_bswap(struct bpf_line_info *i)
+{
+	i->insn_off = bswap_32(i->insn_off);
+	i->file_name_off = bswap_32(i->file_name_off);
+	i->line_off = bswap_32(i->line_off);
+	i->line_col = bswap_32(i->line_col);
+	return sizeof(*i);
+}
+
+static inline int bpf_core_relo_bswap(struct bpf_core_relo *i)
+{
+	_Static_assert(sizeof(i->kind) == sizeof(__u32),
+		       "enum bpf_core_relo_kind is not 32-bit\n");
+	i->insn_off = bswap_32(i->insn_off);
+	i->type_id = bswap_32(i->type_id);
+	i->access_str_off = bswap_32(i->access_str_off);
+	i->kind = bswap_32(i->kind);
+	return sizeof(*i);
+}
+
 enum btf_field_iter_kind {
 	BTF_FIELD_ITER_IDS,
 	BTF_FIELD_ITER_STRS,
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar <tony.ambardar@gmail.com> wrote:
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/btf.c             | 192 +++++++++++++++++++++++++++++---
 tools/lib/bpf/btf.h             |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  33 ++++++
 4 files changed, 214 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index f5081de86ee0..064cfe126c09 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3022,25 +3022,102 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
 	return btf_ext_setup_info(btf_ext, &param);
 }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
+/* Swap byte-order of BTF.ext header with any endianness */
+static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len)
 {
const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
struct btf_ext_header *h = btf_ext->hdr;
if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
data_size < hdr->hdr_len) {
pr_debug("BTF.ext header not found\n");
h->magic = bswap_16(h->magic);
h->hdr_len = bswap_32(h->hdr_len);
h->func_info_off = bswap_32(h->func_info_off);
h->func_info_len = bswap_32(h->func_info_len);
h->line_info_off = bswap_32(h->line_info_off);
h->line_info_len = bswap_32(h->line_info_len);
if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
h->core_relo_off = bswap_32(h->core_relo_off);
h->core_relo_len = bswap_32(h->core_relo_len);
+}
+/* Swap byte-order of a generic info subsection */
+static void info_subsec_bswap(const struct btf_ext_header *hdr, bool native,
+			      __u32 off, __u32 len, anon_info_bswap_fn_t bswap)
ok, so I'm not a fan of this bswap callback, tbh. Also, we don't really enforce that each kind of record has the exact size we expect (i.e., bpf_line_info_min and bpf_func_info_min shouldn't be "min" for the byte-swapped case, it should be exact).
How about this slight modification: split byte swapping of sections/subsection metadata, so we adjust record size, sec_name_off and num_info separately from adjusting each record.
Once this swapping is done we:
a) validate record size for each section is expected (according to its type, of course) b) we can then use for_each_btf_ext_sec() and for_each_btf_ext_rec() macro (which assume proper in-memory metadata byte order) and then hard-code swapping of each record fields
No callbacks.
This has also a benefit of not needing this annoying "bool native" flag when producing raw bytes. We just ensure proper order of operation:
a) swap records b) swap metadata (so just mirrored order from initialization)
WDYT?
pw-bot: cr
+{
__u32 left, i, *rs, rec_size, num_info;
struct btf_ext_info_sec *si;
void *p;
if (len == 0)
return;
rs = (void *)hdr + hdr->hdr_len + off; /* record size */
si = (void *)rs + sizeof(__u32); /* sec info #1 */
rec_size = native ? *rs : bswap_32(*rs);
*rs = bswap_32(*rs);
left = len - sizeof(__u32);
while (left > 0) {
num_info = native ? si->num_info : bswap_32(si->num_info);
si->sec_name_off = bswap_32(si->sec_name_off);
si->num_info = bswap_32(si->num_info);
left -= offsetof(struct btf_ext_info_sec, data);
p = si->data;
for (i = 0; i < num_info; i++) /* list of records */
p += bswap(p);
si = p;
left -= rec_size * num_info;
nit: extra space here
}
+}
[...]
On Fri, Aug 30, 2024 at 02:14:19PM -0700, Andrii Nakryiko wrote:
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar <tony.ambardar@gmail.com> wrote:
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/btf.c             | 192 +++++++++++++++++++++++++++++---
 tools/lib/bpf/btf.h             |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  33 ++++++
 4 files changed, 214 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index f5081de86ee0..064cfe126c09 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3022,25 +3022,102 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
 	return btf_ext_setup_info(btf_ext, &param);
 }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
+/* Swap byte-order of BTF.ext header with any endianness */
+static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len)
 {
const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
struct btf_ext_header *h = btf_ext->hdr;
if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
data_size < hdr->hdr_len) {
pr_debug("BTF.ext header not found\n");
h->magic = bswap_16(h->magic);
h->hdr_len = bswap_32(h->hdr_len);
h->func_info_off = bswap_32(h->func_info_off);
h->func_info_len = bswap_32(h->func_info_len);
h->line_info_off = bswap_32(h->line_info_off);
h->line_info_len = bswap_32(h->line_info_len);
if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
h->core_relo_off = bswap_32(h->core_relo_off);
h->core_relo_len = bswap_32(h->core_relo_len);
+}
+/* Swap byte-order of a generic info subsection */
+static void info_subsec_bswap(const struct btf_ext_header *hdr, bool native,
+			      __u32 off, __u32 len, anon_info_bswap_fn_t bswap)
ok, so I'm not a fan of this bswap callback, tbh. Also, we don't really enforce that each kind of record has exact size we expect (i.e., bpf_line_info_min and bpf_func_info_min shouldn't be "min" for byte-swapped case, it should be exact).
How about this slight modification: split byte swapping of sections/subsection metadata, so we adjust record size, sec_name_off and num_info separately from adjusting each record.
Hmmm, the bulk of code needed is to parse the metadata, with only 2 lines used to go through records. Splitting per above would add unnecessary duplication it seems, no?
Once this swapping is done we:
a) validate record size for each section is expected (according to its type, of course)
This is a good point I overlooked, and needs doing in any case.
b) we can then use for_each_btf_ext_sec() and for_each_btf_ext_rec() macro (which assume proper in-memory metadata byte order) and then hard-code swapping of each record fields
How easily can we use these macros? Consider the current call chain:
btf_ext__new
    btf_ext_parse
        btf_ext_bswap_hdr       (1)
        btf_ext_bswap_info      (2)
    btf_ext_setup_func_info
    btf_ext_setup_line_info
    btf_ext_setup_core_relos    (3)

btf_ext__raw_data
    btf_ext_bswap_info          (4)
    btf_ext_bswap_hdr
The macros iterate on 'struct btf_ext_info' instances in 'struct btf_ext' but these are only set up after (3) it seems and unavailable at (2). I suppose they could be used with some sort of kludge but unsure how well they'll work.
No callbacks.
This has also a benefit of not needing this annoying "bool native" flag when producing raw bytes. We just ensure proper order of operation:
a) swap records b) swap metadata (so just mirrored order from initialization)
How does that work? If we split up btf_ext_bswap_info(), after (1) btf_ext->swapped_endian is set and btf_ext->hdr->magic is swapped, so at (2) it's not possible to tell the current info data byte order without some hinting.
But maybe if we defer setting btf_ext->swapped_endian until after (b) we can drop the "bool native" thanks to symmetry breaking. Let me check.
WDYT?
Adding a record_size check is definitely needed.
But I have trouble seeing how splitting bswap of info metadata/records would yield something simpler and cleaner than the callbacks. What if they were passed via a descriptor, as in btf_ext_setup_func_info()? I think I need to play around with this a while and see..
It would also help me if you'd elaborate on the drawbacks you see of using callbacks, given I see then in other parts of libbpf.
pw-bot: cr
+{
__u32 left, i, *rs, rec_size, num_info;
struct btf_ext_info_sec *si;
void *p;
if (len == 0)
return;
rs = (void *)hdr + hdr->hdr_len + off; /* record size */
si = (void *)rs + sizeof(__u32); /* sec info #1 */
rec_size = native ? *rs : bswap_32(*rs);
*rs = bswap_32(*rs);
left = len - sizeof(__u32);
while (left > 0) {
num_info = native ? si->num_info : bswap_32(si->num_info);
si->sec_name_off = bswap_32(si->sec_name_off);
si->num_info = bswap_32(si->num_info);
left -= offsetof(struct btf_ext_info_sec, data);
p = si->data;
for (i = 0; i < num_info; i++) /* list of records */
p += bswap(p);
si = p;
left -= rec_size * num_info;
nit: extra space here
Fixed, thanks.
}
+}
[...]
On Mon, Sep 2, 2024 at 1:19 AM Tony Ambardar tony.ambardar@gmail.com wrote:
On Fri, Aug 30, 2024 at 02:14:19PM -0700, Andrii Nakryiko wrote:
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/btf.c             | 192 +++++++++++++++++++++++++++++---
 tools/lib/bpf/btf.h             |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  33 ++++++
 4 files changed, 214 insertions(+), 16 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index f5081de86ee0..064cfe126c09 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3022,25 +3022,102 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
	return btf_ext_setup_info(btf_ext, &param);
}

-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
+/* Swap byte-order of BTF.ext header with any endianness */
+static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len)
{
const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
struct btf_ext_header *h = btf_ext->hdr;
if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
data_size < hdr->hdr_len) {
pr_debug("BTF.ext header not found\n");
h->magic = bswap_16(h->magic);
h->hdr_len = bswap_32(h->hdr_len);
h->func_info_off = bswap_32(h->func_info_off);
h->func_info_len = bswap_32(h->func_info_len);
h->line_info_off = bswap_32(h->line_info_off);
h->line_info_len = bswap_32(h->line_info_len);
if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
h->core_relo_off = bswap_32(h->core_relo_off);
h->core_relo_len = bswap_32(h->core_relo_len);
+}
+/* Swap byte-order of a generic info subsection */
+static void info_subsec_bswap(const struct btf_ext_header *hdr, bool native,
__u32 off, __u32 len, anon_info_bswap_fn_t bswap)
ok, so I'm not a fan of this bswap callback, tbh. Also, we don't really enforce that each kind of record has exact size we expect (i.e., bpf_line_info_min and bpf_func_info_min shouldn't be "min" for byte-swapped case, it should be exact).
How about this slight modification: split byte swapping of sections/subsection metadata, so we adjust record size, sec_name_off and num_info separately from adjusting each record.
Hmmm, the bulk of code needed is to parse the metadata, with only 2 lines used to go through records. Splitting per above would add unnecessary duplication it seems, no?
Once this swapping is done we:
a) validate record size for each section is expected (according to its type, of course)
This is a good point I overlooked, and needs doing in any case.
b) we can then use for_each_btf_ext_sec() and for_each_btf_ext_rec() macro (which assume proper in-memory metadata byte order) and then hard-code swapping of each record fields
How easily can we use these macros? Consider the current call chain:
Not that easily, turns out, because a) it acquires data pointer implicitly, which makes it hard for btf_ext_raw_data() and b) it accesses sec->num_info and doesn't cache it, so we'd need an extra local variable to keep it if we were to swap it in raw data.
btf_ext__new
    btf_ext_parse
        btf_ext_bswap_hdr        (1)
        btf_ext_bswap_info       (2)
        btf_ext_setup_func_info
        btf_ext_setup_line_info
        btf_ext_setup_core_relos (3)

btf_ext__raw_data
    btf_ext_bswap_info           (4)
    btf_ext_bswap_hdr
The macros iterate on 'struct btf_ext_info' instances in 'struct btf_ext' but these are only set up after (3) it seems and unavailable at (2). I suppose they could be used with some sort of kludge but unsure how well they'll work.
No callbacks.
This has also a benefit of not needing this annoying "bool native" flag when producing raw bytes. We just ensure proper order of operation:
a) swap records b) swap metadata (so just mirrored order from initialization)
How does that work? If we split up btf_ext_bswap_info(), after (1) btf_ext->swapped_endian is set and btf_ext->hdr->magic is swapped, so at (2) it's not possible to tell the current info data byte order without some hinting.
But maybe if we defer setting btf_ext->swapped_endian until after (b) we can drop the "bool native" thanks to symmetry breaking. Let me check.
WDYT?
Adding a record_size check is definitely needed.
But I have trouble seeing how splitting bswap of info metadata/records would yield something simpler and cleaner than the callbacks. What if they were passed via a descriptor, as in btf_ext_setup_func_info()? I think I need to play around with this a while and see..
It would also help me if you'd elaborate on the drawbacks you see of using callbacks, given I see them in other parts of libbpf.
I replied to your latest patches. I generally dislike callbacks as they make following the code harder. If it's possible to not use callbacks with reasonable simplicity, I'll always go for that. But it's ok, given those existing iteration macros are a bit assuming about data and its endianness, it's hard to use them.
pw-bot: cr
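Sketching the exact record-size check agreed on above: the struct layouts below mirror the UAPI bpf_func_info/bpf_line_info/bpf_core_relo records, but the helper name and section-kind enum are purely illustrative, not libbpf code.

```c
#include <stdint.h>

/* Minimal copies of the UAPI record layouts (only sizes matter here) */
struct bpf_func_info { uint32_t insn_off; uint32_t type_id; };
struct bpf_line_info { uint32_t insn_off, file_name_off, line_off, line_col; };
struct bpf_core_relo { uint32_t insn_off, type_id, access_str_off, kind; };

/* Illustrative section-kind tags, not libbpf's */
enum info_sec_kind { SEC_FUNC_INFO, SEC_LINE_INFO, SEC_CORE_RELO };

/* Reject any record size other than the exact one for the section
 * kind, as proposed in the review (hypothetical helper). */
static int validate_info_rec_size(enum info_sec_kind kind, uint32_t rec_size)
{
	uint32_t expected;

	switch (kind) {
	case SEC_FUNC_INFO: expected = sizeof(struct bpf_func_info); break;
	case SEC_LINE_INFO: expected = sizeof(struct bpf_line_info); break;
	case SEC_CORE_RELO: expected = sizeof(struct bpf_core_relo); break;
	default: return -1;
	}
	return rec_size == expected ? 0 : -1;
}
```

An exact-size check like this is stricter than the "min"-style checks used for native parsing, which is the point raised in the review.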
+{
__u32 left, i, *rs, rec_size, num_info;
struct btf_ext_info_sec *si;
void *p;
if (len == 0)
return;
rs = (void *)hdr + hdr->hdr_len + off; /* record size */
si = (void *)rs + sizeof(__u32); /* sec info #1 */
rec_size = native ? *rs : bswap_32(*rs);
*rs = bswap_32(*rs);
left = len - sizeof(__u32);
while (left > 0) {
num_info = native ? si->num_info : bswap_32(si->num_info);
si->sec_name_off = bswap_32(si->sec_name_off);
si->num_info = bswap_32(si->num_info);
left -= offsetof(struct btf_ext_info_sec, data);
p = si->data;
for (i = 0; i < num_info; i++) /* list of records */
p += bswap(p);
si = p;
left -= rec_size * num_info;
nit: extra space here
Fixed, thanks.
}
+}
[...]
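On the question of telling the current byte order of in-memory data: both .BTF and .BTF.ext begin with the 16-bit magic 0xeB9F, so the order of a raw blob can always be re-derived from the header itself. A minimal sketch, with an illustrative helper name:

```c
#include <stdint.h>
#include <string.h>
#include <byteswap.h>

#define BTF_MAGIC 0xeB9F	/* shared by .BTF and .BTF.ext headers */

/* Return 0 for native order, 1 for swapped, -1 if not BTF data at all.
 * Illustrative helper; libbpf derives the same answer while parsing. */
static int btf_data_swapped(const void *data)
{
	uint16_t magic;

	memcpy(&magic, data, sizeof(magic));	/* alignment-safe load */
	if (magic == BTF_MAGIC)
		return 0;
	if (magic == bswap_16(BTF_MAGIC))
		return 1;
	return -1;
}
```

This only works on the raw blob, of course; once the header has been swapped in place (as after step (1) above), the in-memory state needs separate tracking, which is where the swapped_endian flag discussion comes from.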
On Fri, 2024-08-30 at 00:29 -0700, Tony Ambardar wrote:
[...]
@@ -3050,11 +3127,42 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size) return -ENOTSUP; }
- if (data_size == hdr->hdr_len) {
- if (data_size < hdr_len) {
pr_debug("BTF.ext header not found\n");
return -EINVAL;
- } else if (data_size == hdr_len) { pr_debug("BTF.ext has no data\n"); return -EINVAL; }
- /* Verify mandatory hdr info details present */
- if (hdr_len < offsetofend(struct btf_ext_header, line_info_len)) {
pr_warn("BTF.ext header missing func_info, line_info\n");
return -EINVAL;
- }
- /* Keep hdr native byte-order in memory for introspection */
- if (btf_ext->swapped_endian)
btf_ext_bswap_hdr(btf_ext, hdr_len);
- /* Basic info section consistency checks*/
- info_size = btf_ext->data_size - hdr_len;
- if (info_size & 0x03) {
pr_warn("BTF.ext info size not 4-byte multiple\n");
return -EINVAL;
- }
- info_size -= hdr->func_info_len + hdr->line_info_len;
- if (hdr_len >= offsetofend(struct btf_ext_header, core_relo_len))
info_size -= hdr->core_relo_len;
nit: Since we are checking this, maybe also check that sections do not overlap? Also, why disallowing gaps between sections?
- if (info_size) {
pr_warn("BTF.ext info size mismatch with header data\n");
return -EINVAL;
- }
- /* Keep infos native byte-order in memory for introspection */
- if (btf_ext->swapped_endian)
btf_ext_bswap_info(btf_ext, !btf_ext->swapped_endian);
- return 0;
}
[...]
@@ -3119,15 +3223,71 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size) return btf_ext; } +static void *btf_ext_raw_data(const struct btf_ext *btf_ext_ro, __u32 *size,
bool swap_endian)
+{
- struct btf_ext *btf_ext = (struct btf_ext *)btf_ext_ro;
- const __u32 data_sz = btf_ext->data_size;
- void *data;
- data = swap_endian ? btf_ext->data_swapped : btf_ext->data;
- if (data) {
*size = data_sz;
return data;
- }
- data = calloc(1, data_sz);
- if (!data)
return NULL;
- memcpy(data, btf_ext->data, data_sz);
- if (swap_endian) {
btf_ext_bswap_info(btf_ext, true);
btf_ext_bswap_hdr(btf_ext, btf_ext->hdr->hdr_len);
btf_ext->data_swapped = data;
- }
Nit: I don't like how this function is organized: - if btf_ext->data can't be NULL swap_endian == true at this point; - if btf_ext->data can be NULL and swap_endian == false pointer to `data` would be lost.
I assume that btf_ext->data can't be null, basing on the btf_ext__new(), but function body is a bit confusing.
- *size = data_sz;
- return data;
+}
[...]
On Fri, Aug 30, 2024 at 05:15:06PM -0700, Eduard Zingerman wrote:
On Fri, 2024-08-30 at 00:29 -0700, Tony Ambardar wrote:
[...]
@@ -3050,11 +3127,42 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
		return -ENOTSUP;
	}
- if (data_size == hdr->hdr_len) {
- if (data_size < hdr_len) {
pr_debug("BTF.ext header not found\n");
return -EINVAL;
- } else if (data_size == hdr_len) {
pr_debug("BTF.ext has no data\n");
return -EINVAL;
- }
- /* Verify mandatory hdr info details present */
- if (hdr_len < offsetofend(struct btf_ext_header, line_info_len)) {
pr_warn("BTF.ext header missing func_info, line_info\n");
return -EINVAL;
- }
- /* Keep hdr native byte-order in memory for introspection */
- if (btf_ext->swapped_endian)
btf_ext_bswap_hdr(btf_ext, hdr_len);
- /* Basic info section consistency checks */
- info_size = btf_ext->data_size - hdr_len;
- if (info_size & 0x03) {
pr_warn("BTF.ext info size not 4-byte multiple\n");
return -EINVAL;
- }
- info_size -= hdr->func_info_len + hdr->line_info_len;
- if (hdr_len >= offsetofend(struct btf_ext_header, core_relo_len))
info_size -= hdr->core_relo_len;
nit: Since we are checking this, maybe also check that sections do not overlap? Also, why disallowing gaps between sections?
- if (info_size) {
pr_warn("BTF.ext info size mismatch with header data\n");
return -EINVAL;
- }
- /* Keep infos native byte-order in memory for introspection */
- if (btf_ext->swapped_endian)
btf_ext_bswap_info(btf_ext, !btf_ext->swapped_endian);
- return 0;
}
[...]
@@ -3119,15 +3223,71 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size)
	return btf_ext;
}

+static void *btf_ext_raw_data(const struct btf_ext *btf_ext_ro, __u32 *size,
bool swap_endian)
+{
- struct btf_ext *btf_ext = (struct btf_ext *)btf_ext_ro;
- const __u32 data_sz = btf_ext->data_size;
- void *data;
- data = swap_endian ? btf_ext->data_swapped : btf_ext->data;
- if (data) {
*size = data_sz;
return data;
- }
- data = calloc(1, data_sz);
- if (!data)
return NULL;
- memcpy(data, btf_ext->data, data_sz);
- if (swap_endian) {
btf_ext_bswap_info(btf_ext, true);
btf_ext_bswap_hdr(btf_ext, btf_ext->hdr->hdr_len);
btf_ext->data_swapped = data;
- }
Nit: I don't like how this function is organized: - if btf_ext->data can't be NULL swap_endian == true at this point; - if btf_ext->data can be NULL and swap_endian == false pointer to `data` would be lost.
I assume that btf_ext->data can't be null, basing on the btf_ext__new(), but function body is a bit confusing.
Hi Eduard,
Sorry, I saw this earlier but dropped my reply by mistake I think. You're right that btf_ext->data can't be null, and the awkwardness above is a holdover from trying to use the btf_raw_data() code, where it _can_ be null. I've rewritten it to be clearer for the next v6 series, which also reuses existing info sec validation and drops the extra code you referred to further above.
Thanks, Tony
- *size = data_sz;
- return data;
+}
[...]
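For reference, stripped of the BTF.ext specifics, btf_ext_raw_data() follows a cache-on-first-use pattern; a toy version (illustrative types and names, not libbpf's) makes the intended flow easier to see:

```c
#include <byteswap.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for struct btf_ext: a buffer of u32s kept in native
 * order, plus a lazily built byte-swapped copy (illustrative names). */
struct blob {
	uint32_t *data;
	uint32_t *data_swapped;	/* memoized cache, owned by the blob */
	size_t cnt;
};

/* Cache-on-first-use accessor mirroring btf_ext_raw_data()'s flow:
 * native requests pass through, swapped requests build the copy once. */
static void *blob_raw_data(struct blob *b, int swap_endian)
{
	uint32_t *copy;
	size_t i;

	if (!swap_endian)
		return b->data;
	if (b->data_swapped)
		return b->data_swapped;

	copy = calloc(b->cnt, sizeof(uint32_t));
	if (!copy)
		return NULL;
	for (i = 0; i < b->cnt; i++)
		copy[i] = bswap_32(b->data[i]);
	b->data_swapped = copy;	/* later calls hit the cache */
	return copy;
}
```

Because data is never NULL here, the "pointer lost" branch Eduard describes cannot arise; keeping the native and swapped paths visibly separate like this is the clarity fix Tony mentions for v6.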
Allow bpf_object__open() to access files of either endianness, and convert included BPF programs to native byte-order in-memory for introspection. Loading BPF objects of non-native byte-order is still disallowed however.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/libbpf.c          | 49 +++++++++++++++++++++++++++------
 tools/lib/bpf/libbpf_internal.h | 11 ++++++++
 2 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 0226d3b50709..aa52870b1967 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -694,6 +694,8 @@ struct bpf_object {
	/* Information when doing ELF related work. Only valid if efile.elf is not NULL */
	struct elf_state efile;

+	unsigned char byteorder;
+
	struct btf *btf;
	struct btf_ext *btf_ext;

@@ -940,6 +942,21 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data,
	return 0;
}

+static void bpf_object_bswap_progs(struct bpf_object *obj)
+{
+	struct bpf_program *prog = obj->programs;
+	struct bpf_insn *insn;
+	int p, i;
+
+	for (p = 0; p < obj->nr_programs; p++, prog++) {
+		insn = prog->insns;
+		for (i = 0; i < prog->insns_cnt; i++, insn++)
+			bpf_insn_bswap(insn);
+		pr_debug("prog '%s': converted %zu insns to native byte order\n",
+			 prog->name, prog->insns_cnt);
+	}
+}
+
 static const struct btf_member *
 find_member_by_offset(const struct btf_type *t, __u32 bit_offset)
 {
@@ -1506,6 +1523,7 @@ static void bpf_object__elf_finish(struct bpf_object *obj)

	elf_end(obj->efile.elf);
	obj->efile.elf = NULL;
+	obj->efile.ehdr = NULL;
	obj->efile.symbols = NULL;
	obj->efile.arena_data = NULL;

@@ -1571,6 +1589,18 @@ static int bpf_object__elf_init(struct bpf_object *obj)
		goto errout;
	}

+	/* Validate ELF object endianness... */
+	if (ehdr->e_ident[EI_DATA] != ELFDATA2LSB &&
+	    ehdr->e_ident[EI_DATA] != ELFDATA2MSB) {
+		err = -LIBBPF_ERRNO__ENDIAN;
+		pr_warn("elf: '%s' has unknown byte order\n", obj->path);
+		goto errout;
+	}
+	/* and preserve outside lifetime of bpf_object_open() */
+	obj->byteorder = ehdr->e_ident[EI_DATA];
+
+
	if (elf_getshdrstrndx(elf, &obj->efile.shstrndx)) {
		pr_warn("elf: failed to get section names section index for %s: %s\n",
			obj->path, elf_errmsg(-1));
@@ -1599,19 +1629,15 @@ static int bpf_object__elf_init(struct bpf_object *obj)
	return err;
}

-static int bpf_object__check_endianness(struct bpf_object *obj)
+static bool is_native_endianness(struct bpf_object *obj)
 {
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2LSB)
-		return 0;
+	return obj->byteorder == ELFDATA2LSB;
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
-		return 0;
+	return obj->byteorder == ELFDATA2MSB;
 #else
 # error "Unrecognized __BYTE_ORDER__"
 #endif
-	pr_warn("elf: endianness mismatch in %s.\n", obj->path);
-	return -LIBBPF_ERRNO__ENDIAN;
 }

 static int
@@ -3953,6 +3979,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
		return -LIBBPF_ERRNO__FORMAT;
	}

+	/* change BPF program insns to native endianness for introspection */
+	if (!is_native_endianness(obj))
+		bpf_object_bswap_progs(obj);
+
	/* sort BPF programs by section name and in-section instruction offset
	 * for faster search */
@@ -7992,7 +8022,6 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
	}

	err = bpf_object__elf_init(obj);
-	err = err ? : bpf_object__check_endianness(obj);
	err = err ? : bpf_object__elf_collect(obj);
	err = err ? : bpf_object__collect_externs(obj);
	err = err ? : bpf_object_fixup_btf(obj);
@@ -8500,6 +8529,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch

	if (obj->gen_loader)
		bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
+	else if (!is_native_endianness(obj)) {
+		pr_warn("object '%s' is not native endianness\n", obj->name);
+		return libbpf_err(-LIBBPF_ERRNO__ENDIAN);
+	}

	err = bpf_object_prepare_token(obj);
	err = err ? : bpf_object__probe_loading(obj);
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 81d375015c2b..f32e3e8378a5 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -11,6 +11,7 @@

 #include <stdlib.h>
 #include <limits.h>
+#include <byteswap.h>
 #include <errno.h>
 #include <linux/err.h>
 #include <fcntl.h>
@@ -621,6 +622,16 @@ static inline bool is_ldimm64_insn(struct bpf_insn *insn)
	return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}

+static inline void bpf_insn_bswap(struct bpf_insn *insn)
+{
+	__u8 tmp_reg = insn->dst_reg;
+
+	insn->dst_reg = insn->src_reg;
+	insn->src_reg = tmp_reg;
+	insn->off = bswap_16(insn->off);
+	insn->imm = bswap_32(insn->imm);
+}
+
 /* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
  * Original FD is not closed or altered in any other way.
  * Preserves original FD value, if it's invalid (negative).
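The bpf_insn_bswap() helper above can be exercised in isolation; this standalone sketch copies the instruction layout from linux/bpf.h and applies the same transform (only code is order-independent, the two register nibbles trade places, and off/imm are byte-swapped):

```c
#include <byteswap.h>
#include <stdint.h>

/* Local copy of the instruction layout from linux/bpf.h */
struct insn {
	uint8_t code;		/* opcode: single byte, order-independent */
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/* Same transform as libbpf's bpf_insn_bswap(): swap the register
 * nibbles and byte-swap the 16-bit offset and 32-bit immediate. */
static void insn_bswap(struct insn *insn)
{
	uint8_t tmp_reg = insn->dst_reg;

	insn->dst_reg = insn->src_reg;
	insn->src_reg = tmp_reg;
	insn->off = bswap_16(insn->off);
	insn->imm = bswap_32(insn->imm);
}
```

Note the transform is an involution: applying it twice restores the original instruction, which is what lets the same helper serve both load (foreign to native) and raw-data output (native to foreign).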
On Fri, Aug 30, 2024 at 06:26:10PM -0700, Eduard Zingerman wrote:
On Fri, 2024-08-30 at 14:25 -0700, Andrii Nakryiko wrote:
[...]
err = bpf_object__elf_init(obj);
err = err ? : bpf_object__check_endianness(obj);
err = err ? : bpf_object__elf_collect(obj);
err = err ? : bpf_object__collect_externs(obj);
err = err ? : bpf_object_fixup_btf(obj);
@@ -8500,6 +8529,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
if (obj->gen_loader)
	bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
nit: add {} around if, both sides should either have or not have {}
else if (!is_native_endianness(obj)) {
pr_warn("object '%s' is not native endianness\n", obj->name);
"object '%s': load is not supported in non-native endianness\n"
return libbpf_err(-LIBBPF_ERRNO__ENDIAN);
}
Silly question: why load is allowed to proceed for non-native endianness when obj->gen_loader is set?
Not silly, had similar questions. Having obj->gen_loader set means "light skeleton" is being generated, where it tries to eliminate dependency on libbpf by skeleton code. In this mode, the code doesn't load anything but instead tracks "what would libbpf do" so it can later write a pure BPF loader program. Alexei will correct me or elaborate as needed I hope.
Unconditionally blocking on non-native endianness would break light skel.
On Fri, Aug 30, 2024 at 02:25:54PM -0700, Andrii Nakryiko wrote:
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Allow bpf_object__open() to access files of either endianness, and convert included BPF programs to native byte-order in-memory for introspection. Loading BPF objects of non-native byte-order is still disallowed however.
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/libbpf.c          | 49 +++++++++++++++++++++++++++------
 tools/lib/bpf/libbpf_internal.h | 11 ++++++++
 2 files changed, 52 insertions(+), 8 deletions(-)
[...]
/* Validate ELF object endianness... */
if (ehdr->e_ident[EI_DATA] != ELFDATA2LSB &&
ehdr->e_ident[EI_DATA] != ELFDATA2MSB) {
err = -LIBBPF_ERRNO__ENDIAN;
pr_warn("elf: '%s' has unknown byte order\n", obj->path);
goto errout;
}
/* and preserve outside lifetime of bpf_object_open() */
what does it mean "preserve outside lifetime" ?
bpf_object_open() freed ELF data on exit but didn't zero obj->efile.ehdr, leading to unpredictable use-after-free problems in is_native_endianness(). This is part of the fix but should be clearer e.g. "save after ELF data freed...". Will update.
obj->byteorder = ehdr->e_ident[EI_DATA];
why so many empty lines?..
I'm blind? Fixed, thanks.
if (elf_getshdrstrndx(elf, &obj->efile.shstrndx)) {
	pr_warn("elf: failed to get section names section index for %s: %s\n",
		obj->path, elf_errmsg(-1));
[...]
err = bpf_object__elf_init(obj);
err = err ? : bpf_object__check_endianness(obj);
err = err ? : bpf_object__elf_collect(obj);
err = err ? : bpf_object__collect_externs(obj);
err = err ? : bpf_object_fixup_btf(obj);
@@ -8500,6 +8529,10 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
if (obj->gen_loader)
	bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
nit: add {} around if, both sides should either have or not have {}
OK, done.
else if (!is_native_endianness(obj)) {
pr_warn("object '%s' is not native endianness\n", obj->name);
"object '%s': load is not supported in non-native endianness\n"
Clearer, will update.
return libbpf_err(-LIBBPF_ERRNO__ENDIAN);
}

err = bpf_object_prepare_token(obj);
err = err ? : bpf_object__probe_loading(obj);
[...]
On Fri, Aug 30, 2024 at 06:16:25PM -0700, Eduard Zingerman wrote:
On Fri, 2024-08-30 at 00:29 -0700, Tony Ambardar wrote:
[...]
@@ -940,6 +942,21 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data,
	return 0;
}

+static void bpf_object_bswap_progs(struct bpf_object *obj)
+{
- struct bpf_program *prog = obj->programs;
- struct bpf_insn *insn;
- int p, i;
- for (p = 0; p < obj->nr_programs; p++, prog++) {
insn = prog->insns;
for (i = 0; i < prog->insns_cnt; i++, insn++)
bpf_insn_bswap(insn);
pr_debug("prog '%s': converted %zu insns to native byte order\n",
prog->name, prog->insns_cnt);
Nit: pr_debug already printed available programs at this point, maybe move this call outside of both loops?
Good point. Will update to summarize # of programs converted instead.
- }
+}
static const struct btf_member *
find_member_by_offset(const struct btf_type *t, __u32 bit_offset)
{
[...]
Allow static linking object files of either endianness, checking that input files have consistent byte-order, and setting output endianness from input.
Linking requires in-memory processing of programs, relocations, sections, etc. in native endianness, and output conversion to target byte-order. This is enabled by built-in ELF translation and recent BTF/BTF.ext endianness functions. Further add local functions for swapping byte-order of sections containing BPF insns.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/linker.c | 90 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 74 insertions(+), 16 deletions(-)

diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index 7489306cd6f7..bd97da68eed6 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -135,6 +135,7 @@ struct bpf_linker {
	int fd;
	Elf *elf;
	Elf64_Ehdr *elf_hdr;
+	bool swapped_endian;

	/* Output sections metadata */
	struct dst_sec *secs;
@@ -324,13 +325,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)

	linker->elf_hdr->e_machine = EM_BPF;
	linker->elf_hdr->e_type = ET_REL;
-#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2LSB;
-#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2MSB;
-#else
-#error "Unknown __BYTE_ORDER__"
-#endif
+	/* Set unknown ELF endianness, assign later from input files */
+	linker->elf_hdr->e_ident[EI_DATA] = ELFDATANONE;

	/* STRTAB */
	/* initialize strset with an empty string to conform to ELF */
@@ -541,19 +537,21 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
				const struct bpf_linker_file_opts *opts,
				struct src_obj *obj)
{
-#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	const int host_endianness = ELFDATA2LSB;
-#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	const int host_endianness = ELFDATA2MSB;
-#else
-#error "Unknown __BYTE_ORDER__"
-#endif
	int err = 0;
	Elf_Scn *scn;
	Elf_Data *data;
	Elf64_Ehdr *ehdr;
	Elf64_Shdr *shdr;
	struct src_sec *sec;
+	unsigned char obj_byteorder;
+	unsigned char link_byteorder = linker->elf_hdr->e_ident[EI_DATA];
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+	const unsigned char host_byteorder = ELFDATA2LSB;
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+	const unsigned char host_byteorder = ELFDATA2MSB;
+#else
+#error "Unknown __BYTE_ORDER__"
+#endif

	pr_debug("linker: adding object file '%s'...\n", filename);

@@ -579,11 +577,25 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
		pr_warn_elf("failed to get ELF header for %s", filename);
		return err;
	}
-	if (ehdr->e_ident[EI_DATA] != host_endianness) {
+
+	/* Linker output endianness set by first input object */
+	obj_byteorder = ehdr->e_ident[EI_DATA];
+	if (obj_byteorder != ELFDATA2LSB && obj_byteorder != ELFDATA2MSB) {
+		err = -EOPNOTSUPP;
+		pr_warn("unknown byte order of ELF file %s\n", filename);
+		return err;
+	}
+	if (link_byteorder == ELFDATANONE) {
+		linker->elf_hdr->e_ident[EI_DATA] = obj_byteorder;
+		linker->swapped_endian = obj_byteorder != host_byteorder;
+		pr_debug("linker: set %s-endian output byte order\n",
+			 obj_byteorder == ELFDATA2MSB ? "big" : "little");
+	} else if (link_byteorder != obj_byteorder) {
		err = -EOPNOTSUPP;
-		pr_warn_elf("unsupported byte order of ELF file %s", filename);
+		pr_warn("byte order mismatch with ELF file %s\n", filename);
		return err;
	}
+
	if (ehdr->e_type != ET_REL ||
	    ehdr->e_machine != EM_BPF ||
	    ehdr->e_ident[EI_CLASS] != ELFCLASS64) {
@@ -1111,6 +1123,27 @@ static bool sec_content_is_same(struct dst_sec *dst_sec, struct src_sec *src_sec)
	return true;
}

+static bool is_exec_sec(struct dst_sec *sec)
+{
+	if (!sec || sec->ephemeral)
+		return false;
+	return (sec->shdr->sh_type == SHT_PROGBITS) &&
+	       (sec->shdr->sh_flags & SHF_EXECINSTR);
+}
+
+static int exec_sec_bswap(void *raw_data, int size)
+{
+	const int insn_cnt = size / sizeof(struct bpf_insn);
+	struct bpf_insn *insn = raw_data;
+	int i;
+
+	if (size % sizeof(struct bpf_insn))
+		return -EINVAL;
+	for (i = 0; i < insn_cnt; i++, insn++)
+		bpf_insn_bswap(insn);
+	return 0;
+}
+
 static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src_sec *src)
 {
	void *tmp;
@@ -1170,6 +1203,10 @@ static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src
	memset(dst->raw_data + dst->sec_sz, 0, dst_align_sz - dst->sec_sz);
	/* now copy src data at a properly aligned offset */
	memcpy(dst->raw_data + dst_align_sz, src->data->d_buf, src->shdr->sh_size);
+
+	/* convert added bpf insns to native byte-order */
+	if (linker->swapped_endian && is_exec_sec(dst))
+		exec_sec_bswap(dst->raw_data + dst_align_sz, src->shdr->sh_size);
	}

	dst->sec_sz = dst_final_sz;
@@ -2630,6 +2667,10 @@ int bpf_linker__finalize(struct bpf_linker *linker)
		if (!sec->scn)
			continue;

+		/* restore sections with bpf insns to target byte-order */
+		if (linker->swapped_endian && is_exec_sec(sec))
+			exec_sec_bswap(sec->raw_data, sec->sec_sz);
+
		sec->data->d_buf = sec->raw_data;
	}

@@ -2698,6 +2739,7 @@ static int emit_elf_data_sec(struct bpf_linker *linker, const char *sec_name,

 static int finalize_btf(struct bpf_linker *linker)
 {
+	enum btf_endianness link_endianness;
	LIBBPF_OPTS(btf_dedup_opts, opts);
	struct btf *btf = linker->btf;
	const void *raw_data;
@@ -2742,6 +2784,22 @@ static int finalize_btf(struct bpf_linker *linker)
		return err;
	}

+	/* Set .BTF and .BTF.ext output byte order */
+	link_endianness = linker->elf_hdr->e_ident[EI_DATA] == ELFDATA2MSB ?
+			  BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
+	err = btf__set_endianness(linker->btf, link_endianness);
+	if (err) {
+		pr_warn("failed to set .BTF output endianness: %d\n", err);
+		return err;
+	}
+	if (linker->btf_ext) {
+		err = btf_ext__set_endianness(linker->btf_ext, link_endianness);
+		if (err) {
+			pr_warn("failed to set .BTF.ext output endianness: %d\n", err);
+			return err;
+		}
+	}
+
	/* Emit .BTF section */
	raw_data = btf__raw_data(linker->btf, &raw_sz);
	if (!raw_data)
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Allow static linking of object files of either endianness, checking that input files have a consistent byte order, and setting the output endianness from the input.
Linking requires in-memory processing of programs, relocations, sections, etc. in native endianness, and output conversion to target byte-order. This is enabled by built-in ELF translation and recent BTF/BTF.ext endianness functions. Further add local functions for swapping byte-order of sections containing BPF insns.
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/linker.c | 90 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 74 insertions(+), 16 deletions(-)
[...]
+static bool is_exec_sec(struct dst_sec *sec)
+{
if (!sec || sec->ephemeral)
return false;
return (sec->shdr->sh_type == SHT_PROGBITS) &&
(sec->shdr->sh_flags & SHF_EXECINSTR);
+}
+static int exec_sec_bswap(void *raw_data, int size)
+{
const int insn_cnt = size / sizeof(struct bpf_insn);
struct bpf_insn *insn = raw_data;
int i;
if (size % sizeof(struct bpf_insn))
return -EINVAL;
this shouldn't be checked here, it should be assumed this is valid and was ensured by the caller. And make exec_sec_bswap() a void function, please.
for (i = 0; i < insn_cnt; i++, insn++)
bpf_insn_bswap(insn);
return 0;
+}
static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src_sec *src)
{
	void *tmp;
@@ -1170,6 +1203,10 @@ static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src
		memset(dst->raw_data + dst->sec_sz, 0, dst_align_sz - dst->sec_sz);
		/* now copy src data at a properly aligned offset */
		memcpy(dst->raw_data + dst_align_sz, src->data->d_buf, src->shdr->sh_size);
the check for size % sizeof(struct bpf_insn) should be somewhere here (if is_exec_sec()), right?
/* convert added bpf insns to native byte-order */
if (linker->swapped_endian && is_exec_sec(dst))
exec_sec_bswap(dst->raw_data + dst_align_sz, src->shdr->sh_size);
	}

	dst->sec_sz = dst_final_sz;
@@ -2630,6 +2667,10 @@ int bpf_linker__finalize(struct bpf_linker *linker)
		if (!sec->scn)
			continue;
but no need to check here, we know it's correct, if we got all the way here
/* restore sections with bpf insns to target byte-order */
if (linker->swapped_endian && is_exec_sec(sec))
exec_sec_bswap(sec->raw_data, sec->sec_sz);
sec->data->d_buf = sec->raw_data;
	}
@@ -2698,6 +2739,7 @@ static int emit_elf_data_sec(struct bpf_linker *linker, const char *sec_name,
static int finalize_btf(struct bpf_linker *linker)
{
	enum btf_endianness link_endianness;
	LIBBPF_OPTS(btf_dedup_opts, opts);
	struct btf *btf = linker->btf;
	const void *raw_data;
@@ -2742,6 +2784,22 @@ static int finalize_btf(struct bpf_linker *linker)
		return err;
	}
/* Set .BTF and .BTF.ext output byte order */
link_endianness = linker->elf_hdr->e_ident[EI_DATA] == ELFDATA2MSB ?
BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
err = btf__set_endianness(linker->btf, link_endianness);
if (err) {
pr_warn("failed to set .BTF output endianness: %d\n", err);
return err;
}
link_endianness is always well-formed enum, there is no need to check errors, here and for btf_ext__set_endianness, please drop both
if (linker->btf_ext) {
err = btf_ext__set_endianness(linker->btf_ext, link_endianness);
if (err) {
pr_warn("failed to set .BTF.ext output endianness: %d\n", err);
return err;
}
}
/* Emit .BTF section */
	raw_data = btf__raw_data(linker->btf, &raw_sz);
	if (!raw_data)
-- 2.34.1
On Fri, Aug 30, 2024 at 02:25:07PM -0700, Andrii Nakryiko wrote:
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Allow static linking of object files of either endianness, checking that input files have a consistent byte order, and setting the output endianness from the input.
Linking requires in-memory processing of programs, relocations, sections, etc. in native endianness, and output conversion to target byte-order. This is enabled by built-in ELF translation and recent BTF/BTF.ext endianness functions. Further add local functions for swapping byte-order of sections containing BPF insns.
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/linker.c | 90 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 74 insertions(+), 16 deletions(-)
[...]
+static bool is_exec_sec(struct dst_sec *sec)
+{
if (!sec || sec->ephemeral)
return false;
return (sec->shdr->sh_type == SHT_PROGBITS) &&
(sec->shdr->sh_flags & SHF_EXECINSTR);
+}
+static int exec_sec_bswap(void *raw_data, int size)
+{
const int insn_cnt = size / sizeof(struct bpf_insn);
struct bpf_insn *insn = raw_data;
int i;
if (size % sizeof(struct bpf_insn))
return -EINVAL;
this shouldn't be checked here, it should be assumed this is valid and was ensured by the caller. And make exec_sec_bswap() a void function, please.
for (i = 0; i < insn_cnt; i++, insn++)
bpf_insn_bswap(insn);
return 0;
+}
static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src_sec *src)
{
	void *tmp;
@@ -1170,6 +1203,10 @@ static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src
		memset(dst->raw_data + dst->sec_sz, 0, dst_align_sz - dst->sec_sz);
		/* now copy src data at a properly aligned offset */
		memcpy(dst->raw_data + dst_align_sz, src->data->d_buf, src->shdr->sh_size);
the check for size % sizeof(struct bpf_insn) should be somewhere here (if is_exec_sec()), right?
/* convert added bpf insns to native byte-order */
if (linker->swapped_endian && is_exec_sec(dst))
exec_sec_bswap(dst->raw_data + dst_align_sz, src->shdr->sh_size);
	}

	dst->sec_sz = dst_final_sz;
@@ -2630,6 +2667,10 @@ int bpf_linker__finalize(struct bpf_linker *linker)
		if (!sec->scn)
			continue;
but no need to check here, we know it's correct, if we got all the way here
I did a pass earlier to reorganize sanity checks and remove redundant ones, and realized linker_sanity_check_elf() already does what we want but overlooked dropping this from exec_sec_bswap(). Will do so now. Thanks for catching!
/* restore sections with bpf insns to target byte-order */
if (linker->swapped_endian && is_exec_sec(sec))
exec_sec_bswap(sec->raw_data, sec->sec_sz);
sec->data->d_buf = sec->raw_data;
	}
@@ -2698,6 +2739,7 @@ static int emit_elf_data_sec(struct bpf_linker *linker, const char *sec_name,
static int finalize_btf(struct bpf_linker *linker)
{
	enum btf_endianness link_endianness;
	LIBBPF_OPTS(btf_dedup_opts, opts);
	struct btf *btf = linker->btf;
	const void *raw_data;
@@ -2742,6 +2784,22 @@ static int finalize_btf(struct bpf_linker *linker)
		return err;
	}
/* Set .BTF and .BTF.ext output byte order */
link_endianness = linker->elf_hdr->e_ident[EI_DATA] == ELFDATA2MSB ?
BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
err = btf__set_endianness(linker->btf, link_endianness);
if (err) {
pr_warn("failed to set .BTF output endianness: %d\n", err);
return err;
}
link_endianness is always well-formed enum, there is no need to check errors, here and for btf_ext__set_endianness, please drop both
Right, makes sense.
if (linker->btf_ext) {
err = btf_ext__set_endianness(linker->btf_ext, link_endianness);
if (err) {
pr_warn("failed to set .BTF.ext output endianness: %d\n", err);
return err;
}
}
/* Emit .BTF section */
	raw_data = btf__raw_data(linker->btf, &raw_sz);
	if (!raw_data)
-- 2.34.1
Track target endianness in 'struct bpf_gen' and process in-memory data in native byte-order, but on finalization convert the embedded loader BPF insns to target endianness.
The light skeleton also includes a target-accessed data blob which is heterogeneous and thus difficult to convert to target byte-order on finalization. Add support functions to convert data to target endianness as it is added to the blob.
Also add additional debug logging for data blob structure details and skeleton loading.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/bpf_gen_internal.h |   1 +
 tools/lib/bpf/gen_loader.c       | 187 +++++++++++++++++++++++--------
 tools/lib/bpf/libbpf.c           |   1 +
 tools/lib/bpf/skel_internal.h    |   3 +-
 4 files changed, 147 insertions(+), 45 deletions(-)
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index fdf44403ff36..6ff963a491d9 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -34,6 +34,7 @@ struct bpf_gen {
 	void *data_cur;
 	void *insn_start;
 	void *insn_cur;
+	bool swapped_endian;
 	ssize_t cleanup_label;
 	__u32 nr_progs;
 	__u32 nr_maps;
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index cf3323fd47b8..4374399bc3f8 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -401,6 +401,15 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps)
 		opts->insns_sz = gen->insn_cur - gen->insn_start;
 		opts->data = gen->data_start;
 		opts->data_sz = gen->data_cur - gen->data_start;
+
+		/* use target endianness for embedded loader */
+		if (gen->swapped_endian) {
+			struct bpf_insn *insn = (struct bpf_insn *)opts->insns;
+			int insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
+
+			for (i = 0; i < insn_cnt; i++)
+				bpf_insn_bswap(insn++);
+		}
 	}
 	return gen->error;
 }
@@ -414,6 +423,31 @@ void bpf_gen__free(struct bpf_gen *gen)
 	free(gen);
 }

+/*
+ * Fields of bpf_attr are set to values in native byte-order before being
+ * written to the target-bound data blob, and may need endian conversion.
+ * This macro allows providing the correct value in situ more simply than
+ * writing a separate converter for *all fields* of *all records* included
+ * in union bpf_attr. Note that sizeof(rval) should match the assignment
+ * target to avoid runtime problems.
+ */
+#define tgt_endian(rval) ({					\
+	typeof(rval) _val;					\
+	if (!gen->swapped_endian)				\
+		_val = (rval);					\
+	else {							\
+		switch (sizeof(rval)) {				\
+		case 1: _val = (rval); break;			\
+		case 2: _val = bswap_16(rval); break;		\
+		case 4: _val = bswap_32(rval); break;		\
+		case 8: _val = bswap_64(rval); break;		\
+		default:_val = (rval);				\
+			pr_warn("unsupported bswap size!\n");	\
+		}						\
+	}							\
+	_val;							\
+})
+
 void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
 		       __u32 btf_raw_size)
 {
@@ -422,11 +456,13 @@ void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
 	union bpf_attr attr;

 	memset(&attr, 0, attr_size);
-	pr_debug("gen: load_btf: size %d\n", btf_raw_size);
 	btf_data = add_data(gen, btf_raw_data, btf_raw_size);
+	pr_debug("gen: load_btf: off %d size %d\n", btf_data, btf_raw_size);

-	attr.btf_size = btf_raw_size;
+	attr.btf_size = tgt_endian(btf_raw_size);
 	btf_load_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: load_btf: btf_load_attr: off %d size %d\n",
+		 btf_load_attr, attr_size);

 	/* populate union bpf_attr with user provided log details */
 	move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_level), 4,
@@ -457,28 +493,30 @@ void bpf_gen__map_create(struct bpf_gen *gen,
 	union bpf_attr attr;

 	memset(&attr, 0, attr_size);
-	attr.map_type = map_type;
-	attr.key_size = key_size;
-	attr.value_size = value_size;
-	attr.map_flags = map_attr->map_flags;
-	attr.map_extra = map_attr->map_extra;
+	attr.map_type = tgt_endian(map_type);
+	attr.key_size = tgt_endian(key_size);
+	attr.value_size = tgt_endian(value_size);
+	attr.map_flags = tgt_endian(map_attr->map_flags);
+	attr.map_extra = tgt_endian(map_attr->map_extra);
 	if (map_name)
 		libbpf_strlcpy(attr.map_name, map_name, sizeof(attr.map_name));
-	attr.numa_node = map_attr->numa_node;
-	attr.map_ifindex = map_attr->map_ifindex;
-	attr.max_entries = max_entries;
-	attr.btf_key_type_id = map_attr->btf_key_type_id;
-	attr.btf_value_type_id = map_attr->btf_value_type_id;
+	attr.numa_node = tgt_endian(map_attr->numa_node);
+	attr.map_ifindex = tgt_endian(map_attr->map_ifindex);
+	attr.max_entries = tgt_endian(max_entries);
+	attr.btf_key_type_id = tgt_endian(map_attr->btf_key_type_id);
+	attr.btf_value_type_id = tgt_endian(map_attr->btf_value_type_id);

 	pr_debug("gen: map_create: %s idx %d type %d value_type_id %d\n",
-		 attr.map_name, map_idx, map_type, attr.btf_value_type_id);
+		 map_name, map_idx, map_type, map_attr->btf_value_type_id);

 	map_create_attr = add_data(gen, &attr, attr_size);
-	if (attr.btf_value_type_id)
+	pr_debug("gen: map_create: map_create_attr: off %d size %d\n",
+		 map_create_attr, attr_size);
+	if (map_attr->btf_value_type_id)
 		/* populate union bpf_attr with btf_fd saved in the stack earlier */
 		move_stack2blob(gen, attr_field(map_create_attr, btf_fd), 4,
 				stack_off(btf_fd));
-	switch (attr.map_type) {
+	switch (map_type) {
 	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
 	case BPF_MAP_TYPE_HASH_OF_MAPS:
 		move_stack2blob(gen, attr_field(map_create_attr, inner_map_fd), 4,
@@ -498,8 +536,8 @@ void bpf_gen__map_create(struct bpf_gen *gen,
 	/* emit MAP_CREATE command */
 	emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
 	debug_ret(gen, "map_create %s idx %d type %d value_size %d value_btf_id %d",
-		  attr.map_name, map_idx, map_type, value_size,
-		  attr.btf_value_type_id);
+		  map_name, map_idx, map_type, value_size,
+		  map_attr->btf_value_type_id);
 	emit_check_err(gen);
 	/* remember map_fd in the stack, if successful */
 	if (map_idx < 0) {
@@ -784,12 +822,12 @@ static void emit_relo_ksym_typeless(struct bpf_gen *gen,
 	emit_ksym_relo_log(gen, relo, kdesc->ref);
 }

-static __u32 src_reg_mask(void)
+static __u32 src_reg_mask(struct bpf_gen *gen)
 {
-#if defined(__LITTLE_ENDIAN_BITFIELD)
-	return 0x0f; /* src_reg,dst_reg,... */
-#elif defined(__BIG_ENDIAN_BITFIELD)
-	return 0xf0; /* dst_reg,src_reg,... */
+#if defined(__LITTLE_ENDIAN_BITFIELD) /* src_reg,dst_reg,... */
+	return gen->swapped_endian ? 0xf0 : 0x0f;
+#elif defined(__BIG_ENDIAN_BITFIELD) /* dst_reg,src_reg,... */
+	return gen->swapped_endian ? 0x0f : 0xf0;
 #else
 #error "Unsupported bit endianness, cannot proceed"
 #endif
@@ -840,7 +878,7 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo,
 	emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 3));
 clear_src_reg:
 	/* clear bpf_object__relocate_data's src_reg assignment, otherwise we get a verifier failure */
-	reg_mask = src_reg_mask();
+	reg_mask = src_reg_mask(gen);
 	emit(gen, BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_8, offsetofend(struct bpf_insn, code)));
 	emit(gen, BPF_ALU32_IMM(BPF_AND, BPF_REG_9, reg_mask));
 	emit(gen, BPF_STX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, offsetofend(struct bpf_insn, code)));
@@ -931,11 +969,33 @@ static void cleanup_relos(struct bpf_gen *gen, int insns)
 	cleanup_core_relo(gen);
 }

+/* Convert func, line, or core relo info records to target endianness,
+ * checking the blob size is consistent with 32-bit fields.
+ */
+static void info_blob_bswap(struct bpf_gen *gen, int info_off, int info_cnt,
+			    anon_info_bswap_fn_t bswap)
+{
+	void *p = gen->data_start + info_off;
+	int i;
+
+	if (!gen->swapped_endian)
+		return;
+
+	for (i = 0; i < info_cnt; i++)
+		p += bswap(p);
+}
+
 void bpf_gen__prog_load(struct bpf_gen *gen,
 			enum bpf_prog_type prog_type, const char *prog_name,
 			const char *license, struct bpf_insn *insns, size_t insn_cnt,
 			struct bpf_prog_load_opts *load_attr, int prog_idx)
 {
+	int func_info_tot_sz = load_attr->func_info_cnt *
+			       load_attr->func_info_rec_size;
+	int line_info_tot_sz = load_attr->line_info_cnt *
+			       load_attr->line_info_rec_size;
+	int core_relo_tot_sz = gen->core_relo_cnt *
+			       sizeof(struct bpf_core_relo);
 	int prog_load_attr, license_off, insns_off, func_info, line_info, core_relos;
 	int attr_size = offsetofend(union bpf_attr, core_relo_rec_size);
 	union bpf_attr attr;
@@ -947,32 +1007,63 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 	license_off = add_data(gen, license, strlen(license) + 1);
 	/* add insns to blob of bytes */
 	insns_off = add_data(gen, insns, insn_cnt * sizeof(struct bpf_insn));
+	pr_debug("gen: prog_load: license off %d insn off %d\n",
+		 license_off, insns_off);

-	attr.prog_type = prog_type;
-	attr.expected_attach_type = load_attr->expected_attach_type;
-	attr.attach_btf_id = load_attr->attach_btf_id;
-	attr.prog_ifindex = load_attr->prog_ifindex;
-	attr.kern_version = 0;
-	attr.insn_cnt = (__u32)insn_cnt;
-	attr.prog_flags = load_attr->prog_flags;
-
-	attr.func_info_rec_size = load_attr->func_info_rec_size;
-	attr.func_info_cnt = load_attr->func_info_cnt;
-	func_info = add_data(gen, load_attr->func_info,
-			     attr.func_info_cnt * attr.func_info_rec_size);
+	/* convert blob insns to target endianness */
+	if (gen->swapped_endian) {
+		struct bpf_insn *insn = gen->data_start + insns_off;
+		int i;

-	attr.line_info_rec_size = load_attr->line_info_rec_size;
-	attr.line_info_cnt = load_attr->line_info_cnt;
-	line_info = add_data(gen, load_attr->line_info,
-			     attr.line_info_cnt * attr.line_info_rec_size);
+		for (i = 0; i < insn_cnt; i++, insn++)
+			bpf_insn_bswap(insn);
+	}

-	attr.core_relo_rec_size = sizeof(struct bpf_core_relo);
-	attr.core_relo_cnt = gen->core_relo_cnt;
-	core_relos = add_data(gen, gen->core_relos,
-			     attr.core_relo_cnt * attr.core_relo_rec_size);
+	attr.prog_type = tgt_endian(prog_type);
+	attr.expected_attach_type = tgt_endian(load_attr->expected_attach_type);
+	attr.attach_btf_id = tgt_endian(load_attr->attach_btf_id);
+	attr.prog_ifindex = tgt_endian(load_attr->prog_ifindex);
+	attr.kern_version = 0;
+	attr.insn_cnt = tgt_endian((__u32)insn_cnt);
+	attr.prog_flags = tgt_endian(load_attr->prog_flags);
+
+	attr.func_info_rec_size = tgt_endian(load_attr->func_info_rec_size);
+	attr.func_info_cnt = tgt_endian(load_attr->func_info_cnt);
+	func_info = add_data(gen, load_attr->func_info, func_info_tot_sz);
+	pr_debug("gen: prog_load: func_info: off %d cnt %d rec size %d\n",
+		 func_info, load_attr->func_info_cnt,
+		 load_attr->func_info_rec_size);
+
+	/* convert func_info blob to target endianness */
+	info_blob_bswap(gen, func_info, load_attr->func_info_cnt,
+			(anon_info_bswap_fn_t)bpf_func_info_bswap);
+
+	attr.line_info_rec_size = tgt_endian(load_attr->line_info_rec_size);
+	attr.line_info_cnt = tgt_endian(load_attr->line_info_cnt);
+	line_info = add_data(gen, load_attr->line_info, line_info_tot_sz);
+	pr_debug("gen: prog_load: line_info: off %d cnt %d rec size %d\n",
+		 line_info, load_attr->line_info_cnt,
+		 load_attr->line_info_rec_size);
+
+	/* convert line_info blob to target endianness */
+	info_blob_bswap(gen, line_info, load_attr->line_info_cnt,
+			(anon_info_bswap_fn_t)bpf_line_info_bswap);
+
+	attr.core_relo_rec_size = tgt_endian((__u32)sizeof(struct bpf_core_relo));
+	attr.core_relo_cnt = tgt_endian(gen->core_relo_cnt);
+	core_relos = add_data(gen, gen->core_relos, core_relo_tot_sz);
+	pr_debug("gen: prog_load: core_relos: off %d cnt %d rec size %zd\n",
+		 core_relos, gen->core_relo_cnt,
+		 sizeof(struct bpf_core_relo));
+
+	/* convert core_relo info blob to target endianness */
+	info_blob_bswap(gen, core_relos, gen->core_relo_cnt,
+			(anon_info_bswap_fn_t)bpf_core_relo_bswap);

 	libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name));
 	prog_load_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: prog_load: prog_load_attr: off %d size %d\n",
+		 prog_load_attr, attr_size);

 	/* populate union bpf_attr with a pointer to license */
 	emit_rel_store(gen, attr_field(prog_load_attr, license), license_off);
@@ -1068,6 +1159,8 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
 	emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));

 	map_update_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: map_update_elem: map_update_attr: off %d size %d\n",
+		 map_update_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
 		       blob_fd_array_off(gen, map_idx));
 	emit_rel_store(gen, attr_field(map_update_attr, key), key);
@@ -1084,14 +1177,18 @@ void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int slo
 	int attr_size = offsetofend(union bpf_attr, flags);
 	int map_update_attr, key;
 	union bpf_attr attr;
+	int tgt_slot;

 	memset(&attr, 0, attr_size);
 	pr_debug("gen: populate_outer_map: outer %d key %d inner %d\n",
 		 outer_map_idx, slot, inner_map_idx);

-	key = add_data(gen, &slot, sizeof(slot));
+	tgt_slot = tgt_endian(slot);
+	key = add_data(gen, &tgt_slot, sizeof(tgt_slot));

 	map_update_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: populate_outer_map: map_update_attr: off %d size %d\n",
+		 map_update_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
 		       blob_fd_array_off(gen, outer_map_idx));
 	emit_rel_store(gen, attr_field(map_update_attr, key), key);
@@ -1114,6 +1211,8 @@ void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
 	memset(&attr, 0, attr_size);
 	pr_debug("gen: map_freeze: idx %d\n", map_idx);
 	map_freeze_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: map_freeze: map_update_attr: off %d size %d\n",
+		 map_freeze_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_freeze_attr, map_fd), 4,
 		       blob_fd_array_off(gen, map_idx));
 	/* emit MAP_FREEZE command */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index aa52870b1967..1ba73c27973c 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9124,6 +9124,7 @@ int bpf_object__gen_loader(struct bpf_object *obj, struct gen_loader_opts *opts)
 	if (!gen)
 		return -ENOMEM;
 	gen->opts = opts;
+	gen->swapped_endian = !is_native_endianness(obj);
 	obj->gen_loader = gen;
 	return 0;
 }
diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h
index 1e82ab06c3eb..67e8477ecb5b 100644
--- a/tools/lib/bpf/skel_internal.h
+++ b/tools/lib/bpf/skel_internal.h
@@ -351,10 +351,11 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
 	attr.test.ctx_size_in = opts->ctx->sz;
 	err = skel_sys_bpf(BPF_PROG_RUN, &attr, test_run_attr_sz);
 	if (err < 0 || (int)attr.test.retval < 0) {
-		opts->errstr = "failed to execute loader prog";
 		if (err < 0) {
+			opts->errstr = "failed to execute loader prog";
 			set_err;
 		} else {
+			opts->errstr = "error returned by loader prog";
 			err = (int)attr.test.retval;
 #ifndef __KERNEL__
 			errno = -err;
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Track target endianness in 'struct bpf_gen' and process in-memory data in native byte-order, but on finalization convert the embedded loader BPF insns to target endianness.
The light skeleton also includes a target-accessed data blob which is heterogeneous and thus difficult to convert to target byte-order on finalization. Add support functions to convert data to target endianness as it is added to the blob.
Also add additional debug logging for data blob structure details and skeleton loading.
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/bpf_gen_internal.h |   1 +
 tools/lib/bpf/gen_loader.c       | 187 +++++++++++++++++++++++--------
 tools/lib/bpf/libbpf.c           |   1 +
 tools/lib/bpf/skel_internal.h    |   3 +-
 4 files changed, 147 insertions(+), 45 deletions(-)

diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index fdf44403ff36..6ff963a491d9 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -34,6 +34,7 @@ struct bpf_gen {
	void *data_cur;
	void *insn_start;
	void *insn_cur;
+	bool swapped_endian;
	ssize_t cleanup_label;
	__u32 nr_progs;
	__u32 nr_maps;
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index cf3323fd47b8..4374399bc3f8 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -401,6 +401,15 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps)
	opts->insns_sz = gen->insn_cur - gen->insn_start;
	opts->data = gen->data_start;
	opts->data_sz = gen->data_cur - gen->data_start;
/* use target endianness for embedded loader */
if (gen->swapped_endian) {
struct bpf_insn *insn = (struct bpf_insn *)opts->insns;
int insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
for (i = 0; i < insn_cnt; i++)
bpf_insn_bswap(insn++);
		}
	}
	return gen->error;
}
@@ -414,6 +423,31 @@ void bpf_gen__free(struct bpf_gen *gen)
	free(gen);
}
+/*
+ * Fields of bpf_attr are set to values in native byte-order before being
+ * written to the target-bound data blob, and may need endian conversion.
+ * This macro allows providing the correct value in situ more simply than
+ * writing a separate converter for *all fields* of *all records* included
+ * in union bpf_attr. Note that sizeof(rval) should match the assignment
+ * target to avoid runtime problems.
+ */
+#define tgt_endian(rval) ({ \
typeof(rval) _val; \
if (!gen->swapped_endian) \
if/else has to have balanced branches w.r.t. {}. Either both should have it or both shouldn't. In this case both should have it.
_val = (rval); \
else { \
switch (sizeof(rval)) { \
case 1: _val = (rval); break; \
case 2: _val = bswap_16(rval); break; \
case 4: _val = bswap_32(rval); break; \
case 8: _val = bswap_64(rval); break; \
default:_val = (rval); \
pr_warn("unsupported bswap size!\n"); \
this is a weird formatting, but you can also just unconditionally assign _val, and only swap it if gen->swapped_endian
typeof(rval) _val = (rval);
if (gen->swapped_endian) {
	switch (...) {
	case 1: ...
	...
	case 8: ...
	default: pr_warn("...");
	}
}
_val;
seems simpler and cleaner, imo
} \
} \
_val; \
+})
for the rest, Alexei, can you please review and give your ack?
On Fri, Aug 30, 2024 at 2:31 PM Andrii Nakryiko andrii.nakryiko@gmail.com wrote:
for the rest, Alexei, can you please review and give your ack?
It looks fine. All of the additional pr_debug()s look a bit excessive. Will take another look at the respin.
On Fri, Aug 30, 2024 at 06:24:47PM -0700, Alexei Starovoitov wrote:
On Fri, Aug 30, 2024 at 2:31 PM Andrii Nakryiko andrii.nakryiko@gmail.com wrote:
for the rest, Alexei, can you please review and give your ack?
It looks fine. All of the additional pr_debug()s look a bit excessive. Will take another look at the respin.
Thanks for taking a look. In my experience, the added pr_debug() calls were essential for troubleshooting problems with the loader blobs, and they only log offsets into the embedded data. I'd hate for someone else to have to debug without them. I think several of them could be consolidated, however, so let me have a try.
There's another area where I'd appreciate your help. The loader code includes some inline debug statements, but I was never able to make these work or see them in trace logs. Could you share an example of how to enable them (e.g. for test_progs tests)?
Thanks, Tony
On Fri, Aug 30, 2024 at 02:30:46PM -0700, Andrii Nakryiko wrote:
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Track target endianness in 'struct bpf_gen' and process in-memory data in native byte-order, but on finalization convert the embedded loader BPF insns to target endianness.
The light skeleton also includes a target-accessed data blob which is heterogeneous and thus difficult to convert to target byte-order on finalization. Add support functions to convert data to target endianness as it is added to the blob.
Also add additional debug logging for data blob structure details and skeleton loading.
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
 tools/lib/bpf/bpf_gen_internal.h |   1 +
 tools/lib/bpf/gen_loader.c       | 187 +++++++++++++++++++++++--------
 tools/lib/bpf/libbpf.c           |   1 +
 tools/lib/bpf/skel_internal.h    |   3 +-
 4 files changed, 147 insertions(+), 45 deletions(-)

diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index fdf44403ff36..6ff963a491d9 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -34,6 +34,7 @@ struct bpf_gen {
	void *data_cur;
	void *insn_start;
	void *insn_cur;
+	bool swapped_endian;
	ssize_t cleanup_label;
	__u32 nr_progs;
	__u32 nr_maps;
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index cf3323fd47b8..4374399bc3f8 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -401,6 +401,15 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps)
	opts->insns_sz = gen->insn_cur - gen->insn_start;
	opts->data = gen->data_start;
	opts->data_sz = gen->data_cur - gen->data_start;
/* use target endianness for embedded loader */
if (gen->swapped_endian) {
struct bpf_insn *insn = (struct bpf_insn *)opts->insns;
int insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
for (i = 0; i < insn_cnt; i++)
bpf_insn_bswap(insn++);
		}
	}
	return gen->error;
}
@@ -414,6 +423,31 @@ void bpf_gen__free(struct bpf_gen *gen)
	free(gen);
}
+/*
+ * Fields of bpf_attr are set to values in native byte-order before being
+ * written to the target-bound data blob, and may need endian conversion.
+ * This macro allows providing the correct value in situ more simply than
+ * writing a separate converter for *all fields* of *all records* included
+ * in union bpf_attr. Note that sizeof(rval) should match the assignment
+ * target to avoid runtime problems.
+ */
+#define tgt_endian(rval) ({ \
typeof(rval) _val; \
if (!gen->swapped_endian) \
if/else has to have balanced branches w.r.t. {}. Either both should have it or both shouldn't. In this case both should have it.
_val = (rval); \
else { \
switch (sizeof(rval)) { \
case 1: _val = (rval); break; \
case 2: _val = bswap_16(rval); break; \
case 4: _val = bswap_32(rval); break; \
case 8: _val = bswap_64(rval); break; \
default:_val = (rval); \
pr_warn("unsupported bswap size!\n"); \
this is a weird formatting, but you can also just unconditionally assign _val, and only swap it if gen->swapped_endian
typeof(rval) _val = (rval);
if (gen->swapped_endian) {
	switch (...) {
	case 1: ...
	...
	case 8: ...
	default: pr_warn("...");
	}
}
_val;
seems simpler and cleaner, imo
Yes, agreed. Will update.
} \
} \
_val; \
+})
for the rest, Alexei, can you please review and give your ack?
Update Makefile build rules to compile BPF programs with target endianness rather than host byte-order. With recent changes, this allows building the full selftests/bpf suite hosted on x86_64 and targeting s390x or mips64eb for example.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/testing/selftests/bpf/Makefile | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index c120617b64ad..27afbfa9e831 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -438,6 +438,7 @@ endef
 IS_LITTLE_ENDIAN = $(shell $(CC) -dM -E - </dev/null | \
			grep 'define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__')
 MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian)
+BPF_TARGET_ENDIAN=$(if $(IS_LITTLE_ENDIAN),--target=bpfel,--target=bpfeb)

 ifneq ($(CROSS_COMPILE),)
 CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%))
@@ -465,17 +466,17 @@ $(OUTPUT)/cgroup_getset_retval_hooks.o: cgroup_getset_retval_hooks.h
 # $4 - binary name
 define CLANG_BPF_BUILD_RULE
	$(call msg,CLNG-BPF,$4,$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v3 -o $2
+	$(Q)$(CLANG) $3 -O2 $(BPF_TARGET_ENDIAN) -c $1 -mcpu=v3 -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but with disabled alu32
 define CLANG_NOALU32_BPF_BUILD_RULE
	$(call msg,CLNG-BPF,$4,$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v2 -o $2
+	$(Q)$(CLANG) $3 -O2 $(BPF_TARGET_ENDIAN) -c $1 -mcpu=v2 -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but with cpu-v4
 define CLANG_CPUV4_BPF_BUILD_RULE
	$(call msg,CLNG-BPF,$4,$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v4 -o $2
+	$(Q)$(CLANG) $3 -O2 $(BPF_TARGET_ENDIAN) -c $1 -mcpu=v4 -o $2
 endef
 # Build BPF object using GCC
 define GCC_BPF_BUILD_RULE