Hello all,
This patch series targets a long-standing BPF usability issue - the lack of general cross-compilation support - by enabling cross-endian usage of libbpf and bpftool, as well as supporting cross-endian build targets for selftests/bpf.
Benefits include improved BPF development and testing for embedded systems based on e.g. big-endian MIPS, more build options e.g. for s390x systems, and better accessibility to the very latest test tools e.g. 'test_progs'.
The series touches many functional areas: BTF.ext handling; object access, introspection, and linking; generation of normal and "light" skeletons.
Initial development and testing used mips64, since this arch makes switching the build byte-order trivial and is thus very handy for A/B testing. However, it lacks some key features (bpf2bpf call, kfuncs, etc) making for poor selftests/bpf coverage.
Final testing takes the kernel and selftests/bpf cross-built from x86_64 to s390x, and runs the result under QEMU/s390x. That same configuration could also be used on kernel-patches/bpf CI for regression testing endian support or perhaps load-sharing s390x builds across x86_64 systems.
This thread includes some background regarding testing on QEMU/s390x and the generally favourable results: https://lore.kernel.org/bpf/ZsEcsaa3juxxQBUf@kodidev-ubuntu/
Earlier versions and related discussion of the series are here:
v1: https://lore.kernel.org/bpf/cover.1724216108.git.tony.ambardar@gmail.com/
v2: https://lore.kernel.org/bpf/cover.1724313164.git.tony.ambardar@gmail.com/
v3: https://lore.kernel.org/bpf/cover.1724843049.git.tony.ambardar@gmail.com/
v4: https://lore.kernel.org/bpf/cover.1724976539.git.tony.ambardar@gmail.com/
Feedback and suggestions are welcome!
Best regards, Tony
Changelog:
---------
v4 -> v5: (feedback from Andrii and Eduard)
 - add separate functions to byte-swap info metadata and records, and
   ensure ordering so record bswaps occur when metadata is native endian
 - use new and existing macros to iterate through info sections/records,
   and check embedded record sizes match that of info structs used
 - drop use of <cough> evil callbacks
 - move setting swapped_endian flag to after byte-swapping functions are
   called during initialization, allowing funcs to infer endianness and
   drop a 'bool native' call parameter
 - simplify byte-swapping macro used to generate light skeleton, and use
   internal lib funcs to swap info records instead of assuming all __u32
 - change info bswap library funcs to void return
 - rework/consolidate new debug statements to reduce their number
 - remove some unneeded handling of impossible errors, and drop a safety
   check already handled elsewhere
 - add and clarify some comments

v3 -> v4:
 - fix a use-after-free ELF data-handling error causing rare CI failures
 - move bswap functions for func/line/core-relo records to internal header
 - use bswap functions also for info blobs in light skeleton

v2 -> v3: (feedback from Andrii)
 - improve some log and commit message formatting
 - restructure BTF.ext endianness safety checks and byte-swapping
 - use BTF.ext info record definitions for swapping, require BTF v1
 - follow BTF API implementation more closely for BTF.ext
 - explicitly reject loading non-native endianness program into kernel
 - simplify linker output byte-order setting
 - drop redundant safety checks during linking
 - simplify endianness macro and improve blob setup code for light skel
 - no unexpected test failures after cross-compiling x86_64 -> s390x

v1 -> v2:
 - fixed a light skeleton bug causing test_progs 'map_ptr' failure
 - simplified some BTF.ext related endianness logic
 - remove an 'inline' usage related to CI checkpatch failure
 - improve some formatting noted by checkpatch warnings
 - unexpected 'test_progs' failures drop 3 -> 2 (x86_64 to s390x cross)
Tony Ambardar (8):
  libbpf: Improve log message formatting
  libbpf: Fix header comment typos for BTF.ext
  libbpf: Fix output .symtab byte-order during linking
  libbpf: Support BTF.ext loading and output in either endianness
  libbpf: Support opening bpf objects of either endianness
  libbpf: Support linking bpf objects of either endianness
  libbpf: Support creating light skeleton of either endianness
  selftests/bpf: Support cross-endian building
 tools/lib/bpf/bpf_gen_internal.h     |   1 +
 tools/lib/bpf/btf.c                  | 242 +++++++++++++++++++++++++--
 tools/lib/bpf/btf.h                  |   3 +
 tools/lib/bpf/btf_dump.c             |   2 +-
 tools/lib/bpf/btf_relocate.c         |   2 +-
 tools/lib/bpf/gen_loader.c           | 191 +++++++++++++++------
 tools/lib/bpf/libbpf.c               |  57 +++++--
 tools/lib/bpf/libbpf.map             |   2 +
 tools/lib/bpf/libbpf_internal.h      |  43 ++++-
 tools/lib/bpf/linker.c               |  80 +++++++--
 tools/lib/bpf/relo_core.c            |   2 +-
 tools/lib/bpf/skel_internal.h        |   3 +-
 tools/testing/selftests/bpf/Makefile |   7 +-
 13 files changed, 529 insertions(+), 106 deletions(-)
Fix missing newlines and extraneous terminal spaces in messages.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/btf.c          | 6 +++---
 tools/lib/bpf/btf_dump.c     | 2 +-
 tools/lib/bpf/btf_relocate.c | 2 +-
 tools/lib/bpf/libbpf.c       | 4 ++--
 tools/lib/bpf/relo_core.c    | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 40aae244e35f..5f094c1f4388 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -2941,7 +2941,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
/* If no records, return failure now so .BTF.ext won't be used. */ if (!info_left) { - pr_debug("%s section in .BTF.ext has no records", ext_sec->desc); + pr_debug("%s section in .BTF.ext has no records\n", ext_sec->desc); return -EINVAL; }
@@ -3029,7 +3029,7 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
if (data_size < offsetofend(struct btf_ext_header, hdr_len) || data_size < hdr->hdr_len) { - pr_debug("BTF.ext header not found"); + pr_debug("BTF.ext header not found\n"); return -EINVAL; }
@@ -3291,7 +3291,7 @@ int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts)
d = btf_dedup_new(btf, opts); if (IS_ERR(d)) { - pr_debug("btf_dedup_new failed: %ld", PTR_ERR(d)); + pr_debug("btf_dedup_new failed: %ld\n", PTR_ERR(d)); return libbpf_err(-EINVAL); }
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 894860111ddb..18cbcf342f2b 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -1304,7 +1304,7 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id, * chain, restore stack, emit warning, and try to * proceed nevertheless */ - pr_warn("not enough memory for decl stack:%d", err); + pr_warn("not enough memory for decl stack: %d\n", err); d->decl_stack_cnt = stack_start; return; } diff --git a/tools/lib/bpf/btf_relocate.c b/tools/lib/bpf/btf_relocate.c index 4f7399d85eab..b72f83e15156 100644 --- a/tools/lib/bpf/btf_relocate.c +++ b/tools/lib/bpf/btf_relocate.c @@ -428,7 +428,7 @@ static int btf_relocate_rewrite_strs(struct btf_relocate *r, __u32 i) } else { off = r->str_map[*str_off]; if (!off) { - pr_warn("string '%s' [offset %u] is not mapped to base BTF", + pr_warn("string '%s' [offset %u] is not mapped to base BTF\n", btf__str_by_offset(r->btf, off), *str_off); return -ENOENT; } diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index d3a542649e6b..0226d3b50709 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -12755,7 +12755,7 @@ struct bpf_link *bpf_program__attach_freplace(const struct bpf_program *prog, }
if (prog->type != BPF_PROG_TYPE_EXT) { - pr_warn("prog '%s': only BPF_PROG_TYPE_EXT can attach as freplace", + pr_warn("prog '%s': only BPF_PROG_TYPE_EXT can attach as freplace\n", prog->name); return libbpf_err_ptr(-EINVAL); } @@ -13829,7 +13829,7 @@ int bpf_object__open_subskeleton(struct bpf_object_subskeleton *s) map_type = btf__type_by_id(btf, map_type_id);
if (!btf_is_datasec(map_type)) { - pr_warn("type for map '%1$s' is not a datasec: %2$s", + pr_warn("type for map '%1$s' is not a datasec: %2$s\n", bpf_map__name(map), __btf_kind_str(btf_kind(map_type))); return libbpf_err(-EINVAL); diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 63a4d5ad12d1..7632e9d41827 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -1339,7 +1339,7 @@ int bpf_core_calc_relo_insn(const char *prog_name, cands->cands[i].id, cand_spec); if (err < 0) { bpf_core_format_spec(spec_buf, sizeof(spec_buf), cand_spec); - pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n ", + pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n", prog_name, relo_idx, i, spec_buf, err); return err; }
Mention struct btf_ext_info_sec rather than non-existent btf_sec_func_info in BTF.ext struct documentation.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/libbpf_internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 408df59e0771..8cda511a1982 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -448,11 +448,11 @@ struct btf_ext_info {
  *
  * The func_info subsection layout:
  *   record size for struct bpf_func_info in the func_info subsection
- *   struct btf_sec_func_info for section #1
+ *   struct btf_ext_info_sec for section #1
  *   a list of bpf_func_info records for section #1
  *     where struct bpf_func_info mimics one in include/uapi/linux/bpf.h
  *     but may not be identical
- *   struct btf_sec_func_info for section #2
+ *   struct btf_ext_info_sec for section #2
  *   a list of bpf_func_info records for section #2
  *   ......
  *
Object linking output data uses the default ELF_T_BYTE type for '.symtab' section data, which disables any libelf-based translation. Explicitly set the ELF_T_SYM type for output to restore libelf's byte-order conversion, noting that input '.symtab' data is already correctly translated.
Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/linker.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index 9cd3d4109788..7489306cd6f7 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -396,6 +396,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
 		pr_warn_elf("failed to create SYMTAB data");
 		return -EINVAL;
 	}
+	/* Ensure libelf translates byte-order of symbol records */
+	sec->data->d_type = ELF_T_SYM;
 
 	str_off = strset__add_str(linker->strtab_strs, sec->sec_name);
 	if (str_off < 0)
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
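To illustrate the "convert to native endianness when read" step described above, here is a minimal standalone sketch. It is not libbpf code: the demo_* names and the trimmed-down header struct are assumptions made for illustration, and only the magic value matches the kernel's BTF_MAGIC. Byte order is inferred from the 16-bit magic, multi-byte header fields are swapped once, and the reported data endianness follows from the host byte order and the swapped flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BTF_MAGIC 0xeB9F	/* same value as the kernel's BTF_MAGIC */

/* Hypothetical, trimmed-down stand-in for struct btf_ext_header */
struct demo_ext_hdr {
	uint16_t magic;
	uint8_t  version;	/* single bytes never need swapping */
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t func_info_off;
	uint32_t func_info_len;
};

static uint16_t demo_bswap16(uint16_t v)
{
	return (uint16_t)((v >> 8) | (v << 8));
}

static uint32_t demo_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Detect byte order from the magic and leave the header in native order.
 * Returns 0 on success (setting *swapped), or -1 if the magic is invalid.
 */
static int demo_hdr_to_native(struct demo_ext_hdr *h, bool *swapped)
{
	assert(h != NULL && swapped != NULL);

	if (h->magic == DEMO_BTF_MAGIC) {
		*swapped = false;
		return 0;
	}
	if (h->magic != demo_bswap16(DEMO_BTF_MAGIC))
		return -1;

	h->magic = demo_bswap16(h->magic);
	h->hdr_len = demo_bswap32(h->hdr_len);
	h->func_info_off = demo_bswap32(h->func_info_off);
	h->func_info_len = demo_bswap32(h->func_info_len);
	*swapped = true;
	return 0;
}

/* Original data is big-endian iff host order and swapped flag disagree */
static bool demo_data_is_big_endian(bool host_big_endian, bool swapped)
{
	return host_big_endian != swapped;
}
```

Keeping a swapped flag alongside native-order data in memory is what lets a raw-data accessor later restore the original byte order for output.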
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/btf.c             | 238 +++++++++++++++++++++++++++++---
 tools/lib/bpf/btf.h             |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  28 ++++
 4 files changed, 255 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 5f094c1f4388..c11dfc81d007 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3023,25 +3023,140 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
 	return btf_ext_setup_info(btf_ext, &param);
 }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size) +/* Swap byte-order of BTF.ext header with any endianness */ +static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len) { - const struct btf_ext_header *hdr = (struct btf_ext_header *)data; + struct btf_ext_header *h = btf_ext->hdr;
- if (data_size < offsetofend(struct btf_ext_header, hdr_len) || - data_size < hdr->hdr_len) { - pr_debug("BTF.ext header not found\n"); + h->magic = bswap_16(h->magic); + h->hdr_len = bswap_32(h->hdr_len); + h->func_info_off = bswap_32(h->func_info_off); + h->func_info_len = bswap_32(h->func_info_len); + h->line_info_off = bswap_32(h->line_info_off); + h->line_info_len = bswap_32(h->line_info_len); + + if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len)) + return; + + h->core_relo_off = bswap_32(h->core_relo_off); + h->core_relo_len = bswap_32(h->core_relo_len); +} + +/* Swap metadata byte-order of generic info subsection */ +static int info_subsec_bswap_metadata(const struct btf_ext *btf_ext, struct btf_ext_info *ext_info) +{ + const bool is_native = btf_ext->swapped_endian; + __u32 left, *rs, rec_size, num_info; + struct btf_ext_info_sec *si; + + if (ext_info->len == 0) + return 0; + + rs = ext_info->info - sizeof(__u32); /* back up to record size */ + rec_size = is_native ? *rs : bswap_32(*rs); + if (rec_size != ext_info->rec_size) return -EINVAL; + *rs = bswap_32(*rs); + + si = ext_info->info; /* sec info #1 */ + left = ext_info->len; + while (left > 0) { + num_info = is_native ? si->num_info : bswap_32(si->num_info); + si->sec_name_off = bswap_32(si->sec_name_off); + si->num_info = bswap_32(si->num_info); + si = (void *)si->data + rec_size * num_info; + left -= offsetof(struct btf_ext_info_sec, data) + + rec_size * num_info; }
+ return 0; +} + +/* Swap byte order of info subsection metadata and records in the correct + * order depending on whether or not data is in native endianness. + */ +#define ORDER_INFO_BSWAP(btf_ext, ext_info, info_t, swap_fn) \ +{ \ + const bool is_native = btf_ext->swapped_endian; \ + struct btf_ext_info_sec *si; \ + int c, err; \ + info_t *i; \ + if (is_native) \ + for_each_btf_ext_sec(ext_info, si) \ + for_each_btf_ext_rec(ext_info, si, c, i) \ + swap_fn(i); \ + err = info_subsec_bswap_metadata(btf_ext, ext_info); \ + if (err) { \ + pr_warn(#info_t " record size mismatch!\n"); \ + return err; \ + } \ + if (!is_native) \ + for_each_btf_ext_sec(ext_info, si) \ + for_each_btf_ext_rec(ext_info, si, c, i) \ + swap_fn(i); \ +} + +/* + * Swap endianness of the whole info segment in a BTF.ext data section: + * - requires BTF.ext header data in native byte order + * - only support info structs from BTF version 1 + */ +static int btf_ext_bswap_info(struct btf_ext *btf_ext) +{ + const struct btf_ext_header *h = btf_ext->hdr; + struct btf_ext_info ext = {}; + + /* Swap func_info subsection byte-order */ + ext.info = (void *)h + h->hdr_len + h->func_info_off + sizeof(__u32); + ext.len = h->func_info_len - (h->func_info_len ? sizeof(__u32) : 0); + ext.rec_size = sizeof(struct bpf_func_info); + + ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_func_info, bpf_func_info_bswap); + + /* Swap line_info subsection byte-order */ + ext.info = (void *)h + h->hdr_len + h->line_info_off + sizeof(__u32); + ext.len = h->line_info_len - (h->line_info_len ? sizeof(__u32) : 0); + ext.rec_size = sizeof(struct bpf_line_info); + + ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_line_info, bpf_line_info_bswap); + + /* Swap core_relo subsection byte-order (if present) */ + if (h->hdr_len < offsetofend(struct btf_ext_header, core_relo_len)) + return 0; + + ext.info = (void *)h + h->hdr_len + h->core_relo_off + sizeof(__u32); + ext.len = h->core_relo_len - (h->core_relo_len ? 
sizeof(__u32) : 0); + ext.rec_size = sizeof(struct bpf_core_relo); + + ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_core_relo, bpf_core_relo_bswap); + + return 0; +} +#undef ORDER_INFO_BSWAP + +/* Validate hdr data & info sections, convert to native endianness */ +static int btf_ext_parse(struct btf_ext *btf_ext) +{ + __u32 hdr_len, info_size, data_size = btf_ext->data_size; + struct btf_ext_header *hdr = btf_ext->hdr; + bool swapped_endian = false; + + if (data_size < offsetofend(struct btf_ext_header, hdr_len)) { + pr_debug("BTF.ext header too short\n"); + return -EINVAL; + } + + hdr_len = hdr->hdr_len; if (hdr->magic == bswap_16(BTF_MAGIC)) { - pr_warn("BTF.ext in non-native endianness is not supported\n"); - return -ENOTSUP; + swapped_endian = true; + hdr_len = bswap_32(hdr_len); } else if (hdr->magic != BTF_MAGIC) { pr_debug("Invalid BTF.ext magic:%x\n", hdr->magic); return -EINVAL; }
- if (hdr->version != BTF_VERSION) { + /* Ensure known version of structs, current BTF_VERSION == 1 */ + if (hdr->version != 1) { pr_debug("Unsupported BTF.ext version:%u\n", hdr->version); return -ENOTSUP; } @@ -3051,11 +3166,50 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size) return -ENOTSUP; }
- if (data_size == hdr->hdr_len) { + if (data_size < hdr_len) { + pr_debug("BTF.ext header not found\n"); + return -EINVAL; + } else if (data_size == hdr_len) { pr_debug("BTF.ext has no data\n"); return -EINVAL; }
+ /* Verify mandatory hdr info details present */ + if (hdr_len < offsetofend(struct btf_ext_header, line_info_len)) { + pr_warn("BTF.ext header missing func_info, line_info\n"); + return -EINVAL; + } + + /* Keep hdr native byte-order in memory for introspection */ + if (swapped_endian) + btf_ext_bswap_hdr(btf_ext, hdr_len); + + /* Basic info section consistency checks*/ + info_size = btf_ext->data_size - hdr_len; + if (info_size & 0x03) { + pr_warn("BTF.ext info size not 4-byte multiple\n"); + return -EINVAL; + } + info_size -= hdr->func_info_len + hdr->line_info_len; + if (hdr_len >= offsetofend(struct btf_ext_header, core_relo_len)) + info_size -= hdr->core_relo_len; + if (info_size) { + pr_warn("BTF.ext info size mismatch with header data\n"); + return -EINVAL; + } + + /* Keep infos native byte-order in memory for introspection */ + if (swapped_endian) { + int err = btf_ext_bswap_info(btf_ext); + if (err) + return err; + } + /* + * Set btf_ext->swapped_endian only after all header and info data has + * been swapped, helping btf_ext_bswap_info() determine if its data + * is in native byte order when called. + */ + btf_ext->swapped_endian = swapped_endian; return 0; }
@@ -3067,6 +3221,7 @@ void btf_ext__free(struct btf_ext *btf_ext) free(btf_ext->line_info.sec_idxs); free(btf_ext->core_relo_info.sec_idxs); free(btf_ext->data); + free(btf_ext->data_swapped); free(btf_ext); }
@@ -3087,15 +3242,10 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size) } memcpy(btf_ext->data, data, size);
- err = btf_ext_parse_hdr(btf_ext->data, size); + err = btf_ext_parse(btf_ext); if (err) goto done;
- if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, line_info_len)) { - err = -EINVAL; - goto done; - } - err = btf_ext_setup_func_info(btf_ext); if (err) goto done; @@ -3120,15 +3270,71 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size) return btf_ext; }
+static void *btf_ext_raw_data(const struct btf_ext *btf_ext_ro, __u32 *size, + bool swap_endian) +{ + struct btf_ext *btf_ext = (struct btf_ext *)btf_ext_ro; + const __u32 data_sz = btf_ext->data_size; + void *data; + + data = swap_endian ? btf_ext->data_swapped : btf_ext->data; + if (data) { + *size = data_sz; + return data; + } + + data = calloc(1, data_sz); + if (!data) + return NULL; + memcpy(data, btf_ext->data, data_sz); + + if (swap_endian) { + btf_ext_bswap_info(btf_ext); + btf_ext_bswap_hdr(btf_ext, btf_ext->hdr->hdr_len); + btf_ext->data_swapped = data; + } + + *size = data_sz; + return data; +} + const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size) { + __u32 data_sz; + void *data; + + data = btf_ext_raw_data(btf_ext, &data_sz, btf_ext->swapped_endian); + if (!data) + return errno = ENOMEM, NULL; + *size = btf_ext->data_size; - return btf_ext->data; + return data; }
__attribute__((alias("btf_ext__raw_data"))) const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size);
+enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext) +{ + if (is_host_big_endian()) + return btf_ext->swapped_endian ? BTF_LITTLE_ENDIAN : BTF_BIG_ENDIAN; + else + return btf_ext->swapped_endian ? BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN; +} + +int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian) +{ + if (endian != BTF_LITTLE_ENDIAN && endian != BTF_BIG_ENDIAN) + return libbpf_err(-EINVAL); + + btf_ext->swapped_endian = is_host_big_endian() != (endian == BTF_BIG_ENDIAN); + + if (!btf_ext->swapped_endian) { + free(btf_ext->data_swapped); + btf_ext->data_swapped = NULL; + } + return 0; +}
struct btf_dedup;
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h index b68d216837a9..e3cf91687c78 100644 --- a/tools/lib/bpf/btf.h +++ b/tools/lib/bpf/btf.h @@ -167,6 +167,9 @@ LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset); LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size); LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext); LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size); +LIBBPF_API enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext); +LIBBPF_API int btf_ext__set_endianness(struct btf_ext *btf_ext, + enum btf_endianness endian);
LIBBPF_API int btf__find_str(struct btf *btf, const char *s); LIBBPF_API int btf__add_str(struct btf *btf, const char *s); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 8f0d9ea3b1b4..5c17632807b6 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -421,6 +421,8 @@ LIBBPF_1.5.0 { global: btf__distill_base; btf__relocate; + btf_ext__endianness; + btf_ext__set_endianness; bpf_map__autoattach; bpf_map__set_autoattach; bpf_program__attach_sockmap; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 8cda511a1982..a8531195acd4 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -484,6 +484,8 @@ struct btf_ext { struct btf_ext_header *hdr; void *data; }; + void *data_swapped; + bool swapped_endian; struct btf_ext_info func_info; struct btf_ext_info line_info; struct btf_ext_info core_relo_info; @@ -511,6 +513,32 @@ struct bpf_line_info_min { __u32 line_col; };
+/* Functions to byte-swap info records */
+
+static inline void bpf_func_info_bswap(struct bpf_func_info *i)
+{
+	i->insn_off = bswap_32(i->insn_off);
+	i->type_id = bswap_32(i->type_id);
+}
+
+static inline void bpf_line_info_bswap(struct bpf_line_info *i)
+{
+	i->insn_off = bswap_32(i->insn_off);
+	i->file_name_off = bswap_32(i->file_name_off);
+	i->line_off = bswap_32(i->line_off);
+	i->line_col = bswap_32(i->line_col);
+}
+
+static inline void bpf_core_relo_bswap(struct bpf_core_relo *i)
+{
+	_Static_assert(sizeof(i->kind) == sizeof(__u32),
+		       "enum bpf_core_relo_kind is not 32-bit\n");
+	i->insn_off = bswap_32(i->insn_off);
+	i->type_id = bswap_32(i->type_id);
+	i->access_str_off = bswap_32(i->access_str_off);
+	i->kind = bswap_32(i->kind);
+}
+
 enum btf_field_iter_kind {
 	BTF_FIELD_ITER_IDS,
 	BTF_FIELD_ITER_STRS,
On Tue, Sep 3, 2024 at 12:33 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
tools/lib/bpf/btf.c | 238 +++++++++++++++++++++++++++++--- tools/lib/bpf/btf.h | 3 + tools/lib/bpf/libbpf.map | 2 + tools/lib/bpf/libbpf_internal.h | 28 ++++ 4 files changed, 255 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 5f094c1f4388..c11dfc81d007 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3023,25 +3023,140 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
 	return btf_ext_setup_info(btf_ext, &param);
 }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size) +/* Swap byte-order of BTF.ext header with any endianness */ +static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len) {
const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
struct btf_ext_header *h = btf_ext->hdr;
if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
data_size < hdr->hdr_len) {
pr_debug("BTF.ext header not found\n");
h->magic = bswap_16(h->magic);
h->hdr_len = bswap_32(h->hdr_len);
h->func_info_off = bswap_32(h->func_info_off);
h->func_info_len = bswap_32(h->func_info_len);
h->line_info_off = bswap_32(h->line_info_off);
h->line_info_len = bswap_32(h->line_info_len);
if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
h->core_relo_off = bswap_32(h->core_relo_off);
h->core_relo_len = bswap_32(h->core_relo_len);
+}
+/* Swap metadata byte-order of generic info subsection */ +static int info_subsec_bswap_metadata(const struct btf_ext *btf_ext, struct btf_ext_info *ext_info) +{
const bool is_native = btf_ext->swapped_endian;
__u32 left, *rs, rec_size, num_info;
struct btf_ext_info_sec *si;
if (ext_info->len == 0)
return 0;
rs = ext_info->info - sizeof(__u32); /* back up to record size */
rec_size = is_native ? *rs : bswap_32(*rs);
if (rec_size != ext_info->rec_size) return -EINVAL;
*rs = bswap_32(*rs);
si = ext_info->info; /* sec info #1 */
left = ext_info->len;
while (left > 0) {
num_info = is_native ? si->num_info : bswap_32(si->num_info);
si->sec_name_off = bswap_32(si->sec_name_off);
si->num_info = bswap_32(si->num_info);
si = (void *)si->data + rec_size * num_info;
left -= offsetof(struct btf_ext_info_sec, data) +
rec_size * num_info; }
return 0;
+}
+/* Swap byte order of info subsection metadata and records in the correct
+ * order depending on whether or not data is in native endianness.
+ */
+#define ORDER_INFO_BSWAP(btf_ext, ext_info, info_t, swap_fn) \ +{ \
const bool is_native = btf_ext->swapped_endian; \
struct btf_ext_info_sec *si; \
int c, err; \
info_t *i; \
if (is_native) \
for_each_btf_ext_sec(ext_info, si) \
for_each_btf_ext_rec(ext_info, si, c, i) \
swap_fn(i); \
err = info_subsec_bswap_metadata(btf_ext, ext_info); \
if (err) { \
pr_warn(#info_t " record size mismatch!\n"); \
return err; \
} \
if (!is_native) \
for_each_btf_ext_sec(ext_info, si) \
for_each_btf_ext_rec(ext_info, si, c, i) \
swap_fn(i); \
+}
Nope-nope-nope, I'm never landing something like this :)
Ok, if there is no clean swapping and setup separation we can come up with, let's still do a decent job at trying to keep all this coherent and simple.
How about this? btf_ext_bswap_hdr() stays, it seems fine.
btf_ext_setup_func_info(), btf_ext_setup_line_info(), and btf_ext_setup_core_relos() all call into generic btf_ext_setup_info, right? We teach btf_ext_setup_info() to work with both endianness (but not swap data at all). The only two fields that we need to byte-swap on the fly are record_size and sinfo->num_info, so it's minimal changes to btf_ext_setup_info().
Oh, and while we are at it, if data is in non-native endianness, btf_ext_setup_info() should return failure if any of the record sizes are not *exactly* matching our expectations. This should be a trivial addition as we have never extended those records, I believe.
This will take care of correctness checking regardless of endianness.
Now, forget about for_each_btf_ext_{sec,rec}(), it will be too fragile to make them work for inverted endianness. But we know now that the data is correct, right? So, fine, let's have a bit of duplication (but without any error checking) to go over raw .BTF.ext data and swap all the records and all metadata fields. This logic now will be used after btf_ext_setup_*() steps, and in btf_ext_raw_data(). For btf_ext_raw_data() you'll also call btf_ext_bswap_hdr() at the very end, right?
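The two-step idea above (validate in either byte order without touching the data, then swap everything with no error checking) could look roughly like the standalone sketch below. This is only an illustration under stated assumptions: all demo_* names are hypothetical, the subsection is modelled as a leading u32 record size followed by sections of { u32 sec_name_off; u32 num_info; records }, and records are treated as arrays of 32-bit words, which happens to hold for the three current record types but is not how a final patch would necessarily do it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t demo_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Read a u32 at p, converting on the fly if the data is byte-swapped */
static uint32_t demo_get_u32(const uint8_t *p, bool swapped)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return swapped ? demo_bswap32(v) : v;
}

static void demo_bswap_u32_at(uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	v = demo_bswap32(v);
	memcpy(p, &v, sizeof(v));
}

/* Step 1: check the layout in either byte order; data is never modified */
static int demo_check_subsec(const uint8_t *data, size_t len,
			     uint32_t want_rec_size, bool swapped)
{
	size_t off = sizeof(uint32_t);
	uint32_t rec_size;

	assert(data != NULL);
	if (len < sizeof(uint32_t))
		return -1;
	rec_size = demo_get_u32(data, swapped);
	if (rec_size % 4 || rec_size < want_rec_size)
		return -1;
	/* Non-native data must match the expected record size exactly */
	if (swapped && rec_size != want_rec_size)
		return -1;

	while (off < len) {
		uint32_t num_info;
		uint64_t sec_sz;

		if (len - off < 2 * sizeof(uint32_t))
			return -1;
		num_info = demo_get_u32(data + off + sizeof(uint32_t), swapped);
		sec_sz = 2 * sizeof(uint32_t) + (uint64_t)num_info * rec_size;
		if (num_info == 0 || sec_sz > len - off)
			return -1;
		off += (size_t)sec_sz;
	}
	return 0;
}

/*
 * Step 2: swap metadata and records of an already validated subsection.
 * data_is_native says which byte order the buffer is in before the call,
 * so counts are always interpreted in native order.
 */
static void demo_bswap_subsec(uint8_t *data, size_t len, bool data_is_native)
{
	size_t off = sizeof(uint32_t);
	uint32_t rec_size = demo_get_u32(data, !data_is_native);

	demo_bswap_u32_at(data);
	while (off < len) {
		uint32_t num_info = demo_get_u32(data + off + sizeof(uint32_t),
						 !data_is_native);
		size_t i, n_words = 2 + (size_t)num_info * (rec_size / 4);

		for (i = 0; i < n_words; i++)
			demo_bswap_u32_at(data + off + 4 * i);
		off += 4 * n_words;
	}
}
```

With this split, the same swap routine serves both directions: non-native to native after the setup checks, and native back to non-native when producing raw output.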
How does that sound? Am I missing something big again?
And fine, let's use callbacks for different record types to keep this simple. I dislike callbacks, in principle, but sometimes they are the simplest way forward, unfortunately (a proper iterator would be better, but that's another story and I don't want to get into that implementation task just yet).
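For reference, a callback-per-record-type swap might be sketched as below. Everything here is hypothetical (the demo_* structs only mirror the field layout of bpf_func_info and bpf_line_info, and the function names are made up); it just shows the shape of a generic walker that takes the record size and a swap function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t demo_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical records mirroring bpf_func_info and bpf_line_info fields */
struct demo_func_info {
	uint32_t insn_off;
	uint32_t type_id;
};

struct demo_line_info {
	uint32_t insn_off;
	uint32_t file_name_off;
	uint32_t line_off;
	uint32_t line_col;
};

/* One callback per record type, each swapping a single record in place */
typedef void (*demo_rec_bswap_fn)(void *rec);

static void demo_func_info_bswap(void *rec)
{
	struct demo_func_info *i = rec;

	i->insn_off = demo_bswap32(i->insn_off);
	i->type_id = demo_bswap32(i->type_id);
}

static void demo_line_info_bswap(void *rec)
{
	struct demo_line_info *i = rec;

	i->insn_off = demo_bswap32(i->insn_off);
	i->file_name_off = demo_bswap32(i->file_name_off);
	i->line_off = demo_bswap32(i->line_off);
	i->line_col = demo_bswap32(i->line_col);
}

/* Generic walker: applies fn to num records spaced rec_size bytes apart */
static void demo_bswap_records(void *recs, uint32_t num, uint32_t rec_size,
			       demo_rec_bswap_fn fn)
{
	uint8_t *p = recs;
	uint32_t i;

	assert(recs != NULL && fn != NULL);
	for (i = 0; i < num; i++, p += rec_size)
		fn(p);
}
```

The walker itself never needs to know field layouts, so adding a record type only means adding one more small swap function.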
+/*
+ * Swap endianness of the whole info segment in a BTF.ext data section:
+ *   - requires BTF.ext header data in native byte order
+ *   - only support info structs from BTF version 1
+ */
+static int btf_ext_bswap_info(struct btf_ext *btf_ext) +{
const struct btf_ext_header *h = btf_ext->hdr;
struct btf_ext_info ext = {};
/* Swap func_info subsection byte-order */
ext.info = (void *)h + h->hdr_len + h->func_info_off + sizeof(__u32);
ext.len = h->func_info_len - (h->func_info_len ? sizeof(__u32) : 0);
ext.rec_size = sizeof(struct bpf_func_info);
ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_func_info, bpf_func_info_bswap);
You shouldn't have bent over backwards just to use for_each_btf_ext_{sec,rec}() macros. Just because I proposed it (initially, without actually coding anything) doesn't mean it's the best and final solution :)
/* Swap line_info subsection byte-order */
ext.info = (void *)h + h->hdr_len + h->line_info_off + sizeof(__u32);
ext.len = h->line_info_len - (h->line_info_len ? sizeof(__u32) : 0);
ext.rec_size = sizeof(struct bpf_line_info);
ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_line_info, bpf_line_info_bswap);
/* Swap core_relo subsection byte-order (if present) */
if (h->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return 0;
ext.info = (void *)h + h->hdr_len + h->core_relo_off + sizeof(__u32);
ext.len = h->core_relo_len - (h->core_relo_len ? sizeof(__u32) : 0);
ext.rec_size = sizeof(struct bpf_core_relo);
ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_core_relo, bpf_core_relo_bswap);
return 0;
+} +#undef ORDER_INFO_BSWAP
[...]
On Wed, Sep 04, 2024 at 12:48:36PM -0700, Andrii Nakryiko wrote:
On Tue, Sep 3, 2024 at 12:33 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
tools/lib/bpf/btf.c | 238 +++++++++++++++++++++++++++++--- tools/lib/bpf/btf.h | 3 + tools/lib/bpf/libbpf.map | 2 + tools/lib/bpf/libbpf_internal.h | 28 ++++ 4 files changed, 255 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 5f094c1f4388..c11dfc81d007 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3023,25 +3023,140 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
 	return btf_ext_setup_info(btf_ext, &param);
 }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
+/* Swap byte-order of BTF.ext header with any endianness */
+static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len)
 {
-	const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
+	struct btf_ext_header *h = btf_ext->hdr;
-	if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
-	    data_size < hdr->hdr_len) {
-		pr_debug("BTF.ext header not found\n");
+	h->magic = bswap_16(h->magic);
+	h->hdr_len = bswap_32(h->hdr_len);
+	h->func_info_off = bswap_32(h->func_info_off);
+	h->func_info_len = bswap_32(h->func_info_len);
+	h->line_info_off = bswap_32(h->line_info_off);
+	h->line_info_len = bswap_32(h->line_info_len);
+	if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
+		return;
+	h->core_relo_off = bswap_32(h->core_relo_off);
+	h->core_relo_len = bswap_32(h->core_relo_len);
+}
+/* Swap metadata byte-order of generic info subsection */ +static int info_subsec_bswap_metadata(const struct btf_ext *btf_ext, struct btf_ext_info *ext_info) +{
const bool is_native = btf_ext->swapped_endian;
__u32 left, *rs, rec_size, num_info;
struct btf_ext_info_sec *si;
if (ext_info->len == 0)
return 0;
rs = ext_info->info - sizeof(__u32); /* back up to record size */
rec_size = is_native ? *rs : bswap_32(*rs);
if (rec_size != ext_info->rec_size) return -EINVAL;
*rs = bswap_32(*rs);
si = ext_info->info; /* sec info #1 */
left = ext_info->len;
while (left > 0) {
num_info = is_native ? si->num_info : bswap_32(si->num_info);
si->sec_name_off = bswap_32(si->sec_name_off);
si->num_info = bswap_32(si->num_info);
si = (void *)si->data + rec_size * num_info;
left -= offsetof(struct btf_ext_info_sec, data) +
rec_size * num_info; }
return 0;
+}
+/* Swap byte order of info subsection metadata and records in the correct
+ * order depending on whether or not data is in native endianness.
+ */
+#define ORDER_INFO_BSWAP(btf_ext, ext_info, info_t, swap_fn) \ +{ \
const bool is_native = btf_ext->swapped_endian; \
struct btf_ext_info_sec *si; \
int c, err; \
info_t *i; \
if (is_native) \
for_each_btf_ext_sec(ext_info, si) \
for_each_btf_ext_rec(ext_info, si, c, i) \
swap_fn(i); \
err = info_subsec_bswap_metadata(btf_ext, ext_info); \
if (err) { \
pr_warn(#info_t " record size mismatch!\n"); \
return err; \
} \
if (!is_native) \
for_each_btf_ext_sec(ext_info, si) \
for_each_btf_ext_rec(ext_info, si, c, i) \
swap_fn(i); \
+}
Nope-nope-nope, I'm never landing something like this :)
Ok, if there is no clean swapping and setup separation we can come up with, let's still do a decent job at trying to keep all this coherent and simple.
How about this? btf_ext_bswap_hdr() stays, it seems fine.
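For readers following along, the header swap being kept can be sketched standalone as below. This is only an illustration: `struct ext_hdr`, `EXT_MAGIC` and `ext_hdr_to_native()` are hypothetical stand-ins that mirror the fields swapped in the patch, not libbpf's own definitions. The point it shows is that hdr_len has to be known in native order before deciding whether the optional core_relo fields exist.

```c
#include <assert.h>
#include <byteswap.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct btf_ext_header (illustration only). */
struct ext_hdr {
	uint16_t magic;
	uint8_t version;
	uint8_t flags;
	uint32_t hdr_len;
	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;
	/* optional part, present only in longer headers */
	uint32_t core_relo_off;
	uint32_t core_relo_len;
};

#define EXT_MAGIC 0xeB9F

/* Swap all multi-byte fields; hdr_len must be passed in native byte order. */
static void ext_hdr_bswap(struct ext_hdr *h, uint32_t hdr_len)
{
	h->magic = bswap_16(h->magic);
	h->hdr_len = bswap_32(h->hdr_len);
	h->func_info_off = bswap_32(h->func_info_off);
	h->func_info_len = bswap_32(h->func_info_len);
	h->line_info_off = bswap_32(h->line_info_off);
	h->line_info_len = bswap_32(h->line_info_len);

	/* short header: the core_relo fields are not part of it */
	if (hdr_len < offsetof(struct ext_hdr, core_relo_len) + sizeof(h->core_relo_len))
		return;

	h->core_relo_off = bswap_32(h->core_relo_off);
	h->core_relo_len = bswap_32(h->core_relo_len);
}

/* Returns 1 if the header was foreign-endian and got swapped,
 * 0 if it was already native, -1 if the magic is not recognized.
 */
static int ext_hdr_to_native(struct ext_hdr *h)
{
	if (h->magic == EXT_MAGIC)
		return 0;
	if (h->magic != bswap_16(EXT_MAGIC))
		return -1;
	ext_hdr_bswap(h, bswap_32(h->hdr_len));
	return 1;
}
```

Calling ext_hdr_to_native() a second time on the same header returns 0, since the magic then reads as native.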
btf_ext_setup_func_info(), btf_ext_setup_line_info(), and btf_ext_setup_core_relos() all call into generic btf_ext_setup_info, right? We teach btf_ext_setup_info() to work with both endianness (but not swap data at all). The only two fields that we need to byte-swap on the fly are record_size and sinfo->num_info, so it's minimal changes to btf_ext_setup_info().
Oh, and while we are at it, if data is in non-native endianness, btf_ext_setup_info() should return failure if any of the record sizes are not *exactly* matching our expectations. This should be a trivial addition as we have never extended those records, I believe.
This will take care of correctness checking regardless of endianness.
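A rough sketch of the validation-only pass described above, under stated assumptions: the subsection layout is taken to be a leading u32 record size followed by { u32 sec_name_off; u32 num_info; records } groups, and `rd32()` and `info_subsec_check()` are hypothetical names, not the existing btf_ext_setup_info(). Only the two metadata fields are byte-swapped as they are read; the data itself is never modified. The relaxed "at least the expected size" rule for native data is an assumption made for the illustration.

```c
#include <assert.h>
#include <byteswap.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a u32 from possibly foreign-endian data without modifying it. */
static uint32_t rd32(const uint8_t *p, bool swapped)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return swapped ? bswap_32(v) : v;
}

/* Validate one info subsection; returns 0 or -EINVAL.
 * For foreign-endian input the record size must match exactly.
 */
static int info_subsec_check(const uint8_t *data, size_t len,
			     uint32_t want_rec_size, bool swapped)
{
	size_t off = sizeof(uint32_t);
	uint32_t rec_size, num_info;
	uint64_t need;

	if (len == 0)
		return 0;		/* empty subsection is fine */
	if (len < sizeof(uint32_t))
		return -EINVAL;

	rec_size = rd32(data, swapped);
	if (swapped ? rec_size != want_rec_size : rec_size < want_rec_size)
		return -EINVAL;

	while (off < len) {
		if (len - off < 2 * sizeof(uint32_t))
			return -EINVAL;	/* truncated section header */
		num_info = rd32(data + off + sizeof(uint32_t), swapped);
		if (num_info == 0)
			return -EINVAL;
		/* 64-bit math so rec_size * num_info cannot wrap */
		need = 2 * sizeof(uint32_t) + (uint64_t)rec_size * num_info;
		if (len - off < need)
			return -EINVAL;	/* records run past the end */
		off += (size_t)need;
	}
	return 0;
}
```

Because nothing is written, the same buffer can be handed to a separate swap pass afterwards, and that pass can skip error handling since the layout is already known to be sane.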
Now, forget about for_each_btf_ext_{sec,rec}(), it will be too fragile to make them work for inverted endianness. But we know now that the data is correct, right? So, fine, let's have a bit of duplication (but without any error checking) to go over raw .BTF.ext data and swap all the records and all metadata fields. This logic now will be used after btf_ext_setup_*() steps, and in btf_ext_raw_data(). For btf_ext_raw_data() you'll also call btf_ext_bswap_hdr() at the very end, right?
How does that sound? Am I missing something big again?
And fine, let's use callbacks for different record types to keep this simple. I dislike callbacks, in principle, but sometimes they are the simplest way forward, unfortunately (a proper iterator would be better, but that's another story and I don't want to get into that implementation task just yet).
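To make the callback idea concrete, here is a minimal sketch of a swap pass over one already-validated info subsection. All names (`rec_bswap_fn`, `demo_rec_bswap`, `info_subsec_bswap`) are hypothetical, and the record is assumed to be two u32 words purely for the demonstration. The one subtlety shown is that rec_size and num_info must be captured in native form either before or after swapping, depending on the current state of the data.

```c
#include <assert.h>
#include <byteswap.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*rec_bswap_fn)(void *rec);

/* Hypothetical record of two u32 words, e.g. { insn_off, type_id }. */
static void demo_rec_bswap(void *rec)
{
	uint32_t *w = rec;

	w[0] = bswap_32(w[0]);
	w[1] = bswap_32(w[1]);
}

/* Swap metadata and records of one info subsection in place.
 * 'words' points at the leading rec_size word, 'len' is in bytes, and
 * 'is_native' tells whether the data is native-endian on entry.
 * No error checking: the layout is assumed validated beforehand.
 */
static void info_subsec_bswap(uint32_t *words, size_t len, bool is_native,
			      rec_bswap_fn swap_rec)
{
	uint32_t rec_size, num_info, i;
	uint8_t *p, *end;

	if (len == 0)
		return;

	rec_size = is_native ? words[0] : bswap_32(words[0]);
	words[0] = bswap_32(words[0]);

	p = (uint8_t *)(words + 1);
	end = (uint8_t *)words + len;
	while (p < end) {
		uint32_t *sec = (uint32_t *)p;	/* sec_name_off, num_info */

		num_info = is_native ? sec[1] : bswap_32(sec[1]);
		sec[0] = bswap_32(sec[0]);
		sec[1] = bswap_32(sec[1]);
		p += 2 * sizeof(uint32_t);

		for (i = 0; i < num_info; i++, p += rec_size)
			swap_rec(p);
	}
}
```

Running the pass twice, first with is_native set and then cleared, restores the original buffer, which is the property the raw-data output path relies on.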
Hi Andrii,
Sorry, this took longer than expected to get back to, but hopefully we're nearly there.
In principle this all works, but so did first swapping info metadata and data together per my v4 patch. Calling various "setup" functions before byte-swapping seems confusing too, and warrants some reorganization, also to cleanly pass the current byte-order state into info parsing functions. On balance, I do agree there's an advantage to reusing the existing info validation checks and dropping the extra ones I had to add, so this seems a worthwhile approach to try. I'll follow up with a v6 shortly.
+/*
+ * Swap endianness of the whole info segment in a BTF.ext data section:
+ *   - requires BTF.ext header data in native byte order
+ *   - only support info structs from BTF version 1
+ */
+static int btf_ext_bswap_info(struct btf_ext *btf_ext) +{
const struct btf_ext_header *h = btf_ext->hdr;
struct btf_ext_info ext = {};
/* Swap func_info subsection byte-order */
ext.info = (void *)h + h->hdr_len + h->func_info_off + sizeof(__u32);
ext.len = h->func_info_len - (h->func_info_len ? sizeof(__u32) : 0);
ext.rec_size = sizeof(struct bpf_func_info);
ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_func_info, bpf_func_info_bswap);
You shouldn't have bent over backwards just to use for_each_btf_ext_{sec,rec}() macros. Just because I proposed it (initially, without actually coding anything) doesn't mean it's the best and final solution :)
/* Swap line_info subsection byte-order */
ext.info = (void *)h + h->hdr_len + h->line_info_off + sizeof(__u32);
ext.len = h->line_info_len - (h->line_info_len ? sizeof(__u32) : 0);
ext.rec_size = sizeof(struct bpf_line_info);
ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_line_info, bpf_line_info_bswap);
/* Swap core_relo subsection byte-order (if present) */
if (h->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return 0;
ext.info = (void *)h + h->hdr_len + h->core_relo_off + sizeof(__u32);
ext.len = h->core_relo_len - (h->core_relo_len ? sizeof(__u32) : 0);
ext.rec_size = sizeof(struct bpf_core_relo);
ORDER_INFO_BSWAP(btf_ext, &ext, struct bpf_core_relo, bpf_core_relo_bswap);
return 0;
+} +#undef ORDER_INFO_BSWAP
[...]
Allow bpf_object__open() to access files of either endianness, and convert included BPF programs to native byte-order in-memory for introspection. Loading BPF objects of non-native byte-order is still disallowed however.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/libbpf.c          | 52 +++++++++++++++++++++++++++------
 tools/lib/bpf/libbpf_internal.h | 11 +++++++
 2 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 0226d3b50709..46f41ea5e74d 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -694,6 +694,8 @@ struct bpf_object {
 	/* Information when doing ELF related work. Only valid if efile.elf is not NULL */
 	struct elf_state efile;

+	unsigned char byteorder;
+
 	struct btf *btf;
 	struct btf_ext *btf_ext;

@@ -940,6 +942,21 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data,
 	return 0;
 }

+static void bpf_object_bswap_progs(struct bpf_object *obj)
+{
+	struct bpf_program *prog = obj->programs;
+	struct bpf_insn *insn;
+	int p, i;
+
+	for (p = 0; p < obj->nr_programs; p++, prog++) {
+		insn = prog->insns;
+		for (i = 0; i < prog->insns_cnt; i++, insn++)
+			bpf_insn_bswap(insn);
+	}
+	pr_debug("converted %zu BPF programs to native byte order\n",
+		 obj->nr_programs);
+}
+
 static const struct btf_member *
 find_member_by_offset(const struct btf_type *t, __u32 bit_offset)
 {
@@ -1506,6 +1523,7 @@ static void bpf_object__elf_finish(struct bpf_object *obj)

 	elf_end(obj->efile.elf);
 	obj->efile.elf = NULL;
+	obj->efile.ehdr = NULL;
 	obj->efile.symbols = NULL;
 	obj->efile.arena_data = NULL;

@@ -1571,6 +1589,16 @@ static int bpf_object__elf_init(struct bpf_object *obj)
 		goto errout;
 	}

+	/* Validate ELF object endianness... */
+	if (ehdr->e_ident[EI_DATA] != ELFDATA2LSB &&
+	    ehdr->e_ident[EI_DATA] != ELFDATA2MSB) {
+		err = -LIBBPF_ERRNO__ENDIAN;
+		pr_warn("elf: '%s' has unknown byte order\n", obj->path);
+		goto errout;
+	}
+	/* and save after bpf_object_open() frees ELF data */
+	obj->byteorder = ehdr->e_ident[EI_DATA];
+
 	if (elf_getshdrstrndx(elf, &obj->efile.shstrndx)) {
 		pr_warn("elf: failed to get section names section index for %s: %s\n",
 			obj->path, elf_errmsg(-1));
@@ -1599,19 +1627,15 @@ static int bpf_object__elf_init(struct bpf_object *obj)
 	return err;
 }

-static int bpf_object__check_endianness(struct bpf_object *obj)
+static bool is_native_endianness(struct bpf_object *obj)
 {
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2LSB)
-		return 0;
+	return obj->byteorder == ELFDATA2LSB;
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
-		return 0;
+	return obj->byteorder == ELFDATA2MSB;
 #else
 # error "Unrecognized __BYTE_ORDER__"
 #endif
-	pr_warn("elf: endianness mismatch in %s.\n", obj->path);
-	return -LIBBPF_ERRNO__ENDIAN;
 }

 static int
@@ -3953,6 +3977,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 		return -LIBBPF_ERRNO__FORMAT;
 	}

+	/* change BPF program insns to native endianness for introspection */
+	if (!is_native_endianness(obj))
+		bpf_object_bswap_progs(obj);
+
 	/* sort BPF programs by section name and in-section instruction offset
 	 * for faster search
 	 */
@@ -7992,7 +8020,6 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf,
 	}

 	err = bpf_object__elf_init(obj);
-	err = err ? : bpf_object__check_endianness(obj);
 	err = err ? : bpf_object__elf_collect(obj);
 	err = err ? : bpf_object__collect_externs(obj);
 	err = err ? : bpf_object_fixup_btf(obj);
@@ -8498,8 +8525,15 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch
 		return libbpf_err(-EINVAL);
 	}

-	if (obj->gen_loader)
+	/* Disallow kernel loading programs of non-native endianness but
+	 * permit cross-endian creation of "light skeleton".
+	 */
+	if (obj->gen_loader) {
 		bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
+	} else if (!is_native_endianness(obj)) {
+		pr_warn("object '%s': loading non-native endianness is unsupported\n", obj->name);
+		return libbpf_err(-LIBBPF_ERRNO__ENDIAN);
+	}

 	err = bpf_object_prepare_token(obj);
 	err = err ? : bpf_object__probe_loading(obj);
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index a8531195acd4..eda7a0cc2b8c 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -11,6 +11,7 @@

 #include <stdlib.h>
 #include <limits.h>
+#include <byteswap.h>
 #include <errno.h>
 #include <linux/err.h>
 #include <fcntl.h>
@@ -616,6 +617,16 @@ static inline bool is_ldimm64_insn(struct bpf_insn *insn)
 	return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
 }

+static inline void bpf_insn_bswap(struct bpf_insn *insn)
+{
+	__u8 tmp_reg = insn->dst_reg;
+
+	insn->dst_reg = insn->src_reg;
+	insn->src_reg = tmp_reg;
+	insn->off = bswap_16(insn->off);
+	insn->imm = bswap_32(insn->imm);
+}
+
 /* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
  * Original FD is not closed or altered in any other way.
  * Preserves original FD value, if it's invalid (negative).
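As a standalone illustration of what the instruction swap in this patch amounts to, the sketch below uses a hypothetical `struct raw_insn` that keeps both register fields in one plain byte instead of bitfields. Under that assumption, exchanging dst_reg and src_reg becomes a nibble swap, while off and imm are ordinary 16-bit and 32-bit byte swaps and the opcode byte stays untouched.

```c
#include <assert.h>
#include <byteswap.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw view of an 8-byte BPF instruction (not struct bpf_insn). */
struct raw_insn {
	uint8_t code;
	uint8_t regs;	/* dst_reg and src_reg, one nibble each */
	int16_t off;
	int32_t imm;
};

/* Convert one instruction between byte orders; applying it twice is a no-op. */
static void raw_insn_bswap(struct raw_insn *insn)
{
	insn->regs = (uint8_t)((insn->regs << 4) | (insn->regs >> 4));
	insn->off = (int16_t)bswap_16((uint16_t)insn->off);
	insn->imm = (int32_t)bswap_32((uint32_t)insn->imm);
}

/* Convert a whole program, as done per program after ELF collection. */
static void raw_insns_bswap(struct raw_insn *insns, size_t cnt)
{
	size_t i;

	for (i = 0; i < cnt; i++)
		raw_insn_bswap(&insns[i]);
}
```

The conversion is its own inverse, so the same helper serves both for reading a foreign-endian object and for writing one back out.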
On Tue, Sep 3, 2024 at 12:33 AM Tony Ambardar <tony.ambardar@gmail.com> wrote:
Allow bpf_object__open() to access files of either endianness, and convert included BPF programs to native byte-order in-memory for introspection. Loading BPF objects of non-native byte-order is still disallowed however.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/libbpf.c          | 52 +++++++++++++++++++++++++++------
 tools/lib/bpf/libbpf_internal.h | 11 +++++++
 2 files changed, 54 insertions(+), 9 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 0226d3b50709..46f41ea5e74d 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -694,6 +694,8 @@ struct bpf_object { /* Information when doing ELF related work. Only valid if efile.elf is not NULL */ struct elf_state efile;
unsigned char byteorder;
struct btf *btf; struct btf_ext *btf_ext;
@@ -940,6 +942,21 @@ bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data, return 0; }
+static void bpf_object_bswap_progs(struct bpf_object *obj) +{
struct bpf_program *prog = obj->programs;
struct bpf_insn *insn;
int p, i;
for (p = 0; p < obj->nr_programs; p++, prog++) {
insn = prog->insns;
for (i = 0; i < prog->insns_cnt; i++, insn++)
bpf_insn_bswap(insn);
}
pr_debug("converted %zu BPF programs to native byte order\n",
obj->nr_programs);
Does it fit in 100 characters? If yes, it stays on single line. That's the rule for all code, don't wrap lines unnecessarily.
+}
static const struct btf_member * find_member_by_offset(const struct btf_type *t, __u32 bit_offset) { @@ -1506,6 +1523,7 @@ static void bpf_object__elf_finish(struct bpf_object *obj)
elf_end(obj->efile.elf); obj->efile.elf = NULL;
obj->efile.ehdr = NULL; obj->efile.symbols = NULL; obj->efile.arena_data = NULL;
@@ -1571,6 +1589,16 @@ static int bpf_object__elf_init(struct bpf_object *obj) goto errout; }
/* Validate ELF object endianness... */
if (ehdr->e_ident[EI_DATA] != ELFDATA2LSB &&
ehdr->e_ident[EI_DATA] != ELFDATA2MSB) {
err = -LIBBPF_ERRNO__ENDIAN;
pr_warn("elf: '%s' has unknown byte order\n", obj->path);
goto errout;
}
/* and save after bpf_object_open() frees ELF data */
obj->byteorder = ehdr->e_ident[EI_DATA];
if (elf_getshdrstrndx(elf, &obj->efile.shstrndx)) { pr_warn("elf: failed to get section names section index for %s: %s\n", obj->path, elf_errmsg(-1));
@@ -1599,19 +1627,15 @@ static int bpf_object__elf_init(struct bpf_object *obj) return err; }
-static int bpf_object__check_endianness(struct bpf_object *obj)
+static bool is_native_endianness(struct bpf_object *obj)
 {
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2LSB)
-		return 0;
+	return obj->byteorder == ELFDATA2LSB;
 #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	if (obj->efile.ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
-		return 0;
+	return obj->byteorder == ELFDATA2MSB;
 #else
 # error "Unrecognized __BYTE_ORDER__"
 #endif
-	pr_warn("elf: endianness mismatch in %s.\n", obj->path);
-	return -LIBBPF_ERRNO__ENDIAN;
 }
static int @@ -3953,6 +3977,10 @@ static int bpf_object__elf_collect(struct bpf_object *obj) return -LIBBPF_ERRNO__FORMAT; }
/* change BPF program insns to native endianness for introspection */
if (!is_native_endianness(obj))
bpf_object_bswap_progs(obj);
/* sort BPF programs by section name and in-section instruction offset * for faster search */
@@ -7992,7 +8020,6 @@ static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, }
err = bpf_object__elf_init(obj);
err = err ? : bpf_object__check_endianness(obj); err = err ? : bpf_object__elf_collect(obj); err = err ? : bpf_object__collect_externs(obj); err = err ? : bpf_object_fixup_btf(obj);
@@ -8498,8 +8525,15 @@ static int bpf_object_load(struct bpf_object *obj, int extra_log_level, const ch return libbpf_err(-EINVAL); }
-	if (obj->gen_loader)
+	/* Disallow kernel loading programs of non-native endianness but
+	 * permit cross-endian creation of "light skeleton".
+	 */
+	if (obj->gen_loader) {
 		bpf_gen__init(obj->gen_loader, extra_log_level, obj->nr_programs, obj->nr_maps);
+	} else if (!is_native_endianness(obj)) {
+		pr_warn("object '%s': loading non-native endianness is unsupported\n", obj->name);
+		return libbpf_err(-LIBBPF_ERRNO__ENDIAN);
+	}

 	err = bpf_object_prepare_token(obj);
 	err = err ? : bpf_object__probe_loading(obj);
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index a8531195acd4..eda7a0cc2b8c 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -11,6 +11,7 @@
#include <stdlib.h> #include <limits.h> +#include <byteswap.h> #include <errno.h> #include <linux/err.h> #include <fcntl.h> @@ -616,6 +617,16 @@ static inline bool is_ldimm64_insn(struct bpf_insn *insn) return insn->code == (BPF_LD | BPF_IMM | BPF_DW); }
+static inline void bpf_insn_bswap(struct bpf_insn *insn)
+{
+	__u8 tmp_reg = insn->dst_reg;
+
+	insn->dst_reg = insn->src_reg;
+	insn->src_reg = tmp_reg;
+	insn->off = bswap_16(insn->off);
+	insn->imm = bswap_32(insn->imm);
+}
+
 /* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
  * Original FD is not closed or altered in any other way.
  * Preserves original FD value, if it's invalid (negative).
-- 2.34.1
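The open/load policy from the patch above can be summarized in a small self-contained sketch. The helper names (`host_byteorder`, `check_obj_byteorder`, `may_load`) are hypothetical, and plain -EINVAL stands in for the libbpf-specific error code; only the ELF constants from <elf.h> are real.

```c
#include <assert.h>
#include <elf.h>
#include <errno.h>
#include <stdbool.h>

/* Byte order of the machine running this code, as an ELF EI_DATA value. */
static unsigned char host_byteorder(void)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return ELFDATA2LSB;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return ELFDATA2MSB;
#else
# error "Unrecognized __BYTE_ORDER__"
#endif
}

/* Open-time check: accept either known byte order and record whether it is
 * native. -EINVAL stands in for the error used by the patch.
 */
static int check_obj_byteorder(const unsigned char *e_ident, bool *is_native)
{
	unsigned char order = e_ident[EI_DATA];

	if (order != ELFDATA2LSB && order != ELFDATA2MSB)
		return -EINVAL;
	*is_native = order == host_byteorder();
	return 0;
}

/* Load-time policy: kernel loading needs native byte order, while
 * generating a light skeleton (gen_loader) is allowed cross-endian.
 */
static bool may_load(bool is_native, bool gen_loader)
{
	return gen_loader || is_native;
}
```

Saving the byte order at open time is what allows the later load-time check to run after the ELF header itself has been released.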
Allow static linking object files of either endianness, checking that input files have consistent byte-order, and setting output endianness from input.
Linking requires in-memory processing of programs, relocations, sections, etc. in native endianness, and output conversion to target byte-order. This is enabled by built-in ELF translation and recent BTF/BTF.ext endianness functions. Further add local functions for swapping byte-order of sections containing BPF insns.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/linker.c | 78 +++++++++++++++++++++++++++++++++---------
 1 file changed, 62 insertions(+), 16 deletions(-)

diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c
index 7489306cd6f7..85562e26c3de 100644
--- a/tools/lib/bpf/linker.c
+++ b/tools/lib/bpf/linker.c
@@ -135,6 +135,7 @@ struct bpf_linker {
 	int fd;
 	Elf *elf;
 	Elf64_Ehdr *elf_hdr;
+	bool swapped_endian;

 	/* Output sections metadata */
 	struct dst_sec *secs;
@@ -324,13 +325,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)

 	linker->elf_hdr->e_machine = EM_BPF;
 	linker->elf_hdr->e_type = ET_REL;
-#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2LSB;
-#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2MSB;
-#else
-#error "Unknown __BYTE_ORDER__"
-#endif
+	/* Set unknown ELF endianness, assign later from input files */
+	linker->elf_hdr->e_ident[EI_DATA] = ELFDATANONE;

 	/* STRTAB */
 	/* initialize strset with an empty string to conform to ELF */
@@ -541,19 +537,21 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
 				const struct bpf_linker_file_opts *opts,
 				struct src_obj *obj)
 {
-#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
-	const int host_endianness = ELFDATA2LSB;
-#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
-	const int host_endianness = ELFDATA2MSB;
-#else
-#error "Unknown __BYTE_ORDER__"
-#endif
 	int err = 0;
 	Elf_Scn *scn;
 	Elf_Data *data;
 	Elf64_Ehdr *ehdr;
 	Elf64_Shdr *shdr;
 	struct src_sec *sec;
+	unsigned char obj_byteorder;
+	unsigned char link_byteorder = linker->elf_hdr->e_ident[EI_DATA];
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+	const unsigned char host_byteorder = ELFDATA2LSB;
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+	const unsigned char host_byteorder = ELFDATA2MSB;
+#else
+#error "Unknown __BYTE_ORDER__"
+#endif

 	pr_debug("linker: adding object file '%s'...\n", filename);

@@ -579,11 +577,25 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
 		pr_warn_elf("failed to get ELF header for %s", filename);
 		return err;
 	}
-	if (ehdr->e_ident[EI_DATA] != host_endianness) {
+
+	/* Linker output endianness set by first input object */
+	obj_byteorder = ehdr->e_ident[EI_DATA];
+	if (obj_byteorder != ELFDATA2LSB && obj_byteorder != ELFDATA2MSB) {
+		err = -EOPNOTSUPP;
+		pr_warn("unknown byte order of ELF file %s\n", filename);
+		return err;
+	}
+	if (link_byteorder == ELFDATANONE) {
+		linker->elf_hdr->e_ident[EI_DATA] = obj_byteorder;
+		linker->swapped_endian = obj_byteorder != host_byteorder;
+		pr_debug("linker: set %s-endian output byte order\n",
+			 obj_byteorder == ELFDATA2MSB ? "big" : "little");
+	} else if (link_byteorder != obj_byteorder) {
 		err = -EOPNOTSUPP;
-		pr_warn_elf("unsupported byte order of ELF file %s", filename);
+		pr_warn("byte order mismatch with ELF file %s\n", filename);
 		return err;
 	}
+
 	if (ehdr->e_type != ET_REL
 	    || ehdr->e_machine != EM_BPF
 	    || ehdr->e_ident[EI_CLASS] != ELFCLASS64) {
@@ -1111,6 +1123,24 @@ static bool sec_content_is_same(struct dst_sec *dst_sec, struct src_sec *src_sec
 	return true;
 }

+static bool is_exec_sec(struct dst_sec *sec)
+{
+	if (!sec || sec->ephemeral)
+		return false;
+	return (sec->shdr->sh_type == SHT_PROGBITS) &&
+	       (sec->shdr->sh_flags & SHF_EXECINSTR);
+}
+
+static void exec_sec_bswap(void *raw_data, int size)
+{
+	const int insn_cnt = size / sizeof(struct bpf_insn);
+	struct bpf_insn *insn = raw_data;
+	int i;
+
+	for (i = 0; i < insn_cnt; i++, insn++)
+		bpf_insn_bswap(insn);
+}
+
 static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src_sec *src)
 {
 	void *tmp;
@@ -1170,6 +1200,10 @@ static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src
 		memset(dst->raw_data + dst->sec_sz, 0, dst_align_sz - dst->sec_sz);
 		/* now copy src data at a properly aligned offset */
 		memcpy(dst->raw_data + dst_align_sz, src->data->d_buf, src->shdr->sh_size);
+
+		/* convert added bpf insns to native byte-order */
+		if (linker->swapped_endian && is_exec_sec(dst))
+			exec_sec_bswap(dst->raw_data + dst_align_sz, src->shdr->sh_size);
 	}

 	dst->sec_sz = dst_final_sz;
@@ -2630,6 +2664,10 @@ int bpf_linker__finalize(struct bpf_linker *linker)
 		if (!sec->scn)
 			continue;

+		/* restore sections with bpf insns to target byte-order */
+		if (linker->swapped_endian && is_exec_sec(sec))
+			exec_sec_bswap(sec->raw_data, sec->sec_sz);
+
 		sec->data->d_buf = sec->raw_data;
 	}

@@ -2698,6 +2736,7 @@ static int emit_elf_data_sec(struct bpf_linker *linker, const char *sec_name,

 static int finalize_btf(struct bpf_linker *linker)
 {
+	enum btf_endianness link_endianness;
 	LIBBPF_OPTS(btf_dedup_opts, opts);
 	struct btf *btf = linker->btf;
 	const void *raw_data;
@@ -2742,6 +2781,13 @@ static int finalize_btf(struct bpf_linker *linker)
 		return err;
 	}

+	/* Set .BTF and .BTF.ext output byte order */
+	link_endianness = linker->elf_hdr->e_ident[EI_DATA] == ELFDATA2MSB ?
+			  BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
+	btf__set_endianness(linker->btf, link_endianness);
+	if (linker->btf_ext)
+		btf_ext__set_endianness(linker->btf_ext, link_endianness);
+
 	/* Emit .BTF section */
 	raw_data = btf__raw_data(linker->btf, &raw_sz);
 	if (!raw_data)
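The byte-order bookkeeping done by the linker in this patch boils down to a tiny state machine, sketched below in isolation. `struct link_state` and `link_add_input()` are hypothetical names for the illustration, and the host byte order is passed in as a parameter so the logic can be exercised for either host.

```c
#include <assert.h>
#include <elf.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical linker state: output byte order is unknown (ELFDATANONE)
 * until the first input object is added.
 */
struct link_state {
	unsigned char out_byteorder;
	bool swapped_endian;
};

/* First input fixes the output byte order; later inputs must match it. */
static int link_add_input(struct link_state *ls, unsigned char obj_byteorder,
			  unsigned char host_byteorder)
{
	if (obj_byteorder != ELFDATA2LSB && obj_byteorder != ELFDATA2MSB)
		return -EOPNOTSUPP;	/* unknown byte order */

	if (ls->out_byteorder == ELFDATANONE) {
		ls->out_byteorder = obj_byteorder;
		ls->swapped_endian = obj_byteorder != host_byteorder;
		return 0;
	}
	if (ls->out_byteorder != obj_byteorder)
		return -EOPNOTSUPP;	/* mixed byte orders */
	return 0;
}
```

With swapped_endian set, instruction sections are converted to native order as they are appended and converted back once at finalization, so all intermediate processing sees native data.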
On Tue, Sep 3, 2024 at 12:34 AM Tony Ambardar <tony.ambardar@gmail.com> wrote:
Allow static linking object files of either endianness, checking that input files have consistent byte-order, and setting output endianness from input.
Linking requires in-memory processing of programs, relocations, sections, etc. in native endianness, and output conversion to target byte-order. This is enabled by built-in ELF translation and recent BTF/BTF.ext endianness functions. Further add local functions for swapping byte-order of sections containing BPF insns.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/linker.c | 78 +++++++++++++++++++++++++++++++++---------
 1 file changed, 62 insertions(+), 16 deletions(-)
This one looks nice, simple and straightforward, thanks.
diff --git a/tools/lib/bpf/linker.c b/tools/lib/bpf/linker.c index 7489306cd6f7..85562e26c3de 100644 --- a/tools/lib/bpf/linker.c +++ b/tools/lib/bpf/linker.c @@ -135,6 +135,7 @@ struct bpf_linker { int fd; Elf *elf; Elf64_Ehdr *elf_hdr;
bool swapped_endian;
[...]
Track target endianness in 'struct bpf_gen' and process in-memory data in native byte-order, but on finalization convert the embedded loader BPF insns to target endianness.
The light skeleton also includes a target-accessed data blob which is heterogeneous and thus difficult to convert to target byte-order on finalization. Add support functions to convert data to target endianness as it is added to the blob.
Also add additional debug logging for data blob structure details and skeleton loading.
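The "convert as it is added" approach for the data blob can be illustrated with typed helpers, as a portable alternative to a size-dispatching macro. Everything below (`struct demo_gen`, `struct demo_attr`, `tgt32`, `tgt64`, `demo_attr_fill`) is hypothetical and only mimics the shape of filling an attribute record; it is not the gen_loader code itself.

```c
#include <assert.h>
#include <byteswap.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical generator state: only the flag relevant here. */
struct demo_gen {
	bool swapped_endian;
};

/* Return 'v' in target byte order; the width is explicit in the helper name,
 * so it cannot silently disagree with the field being assigned.
 */
static uint32_t tgt32(const struct demo_gen *gen, uint32_t v)
{
	return gen->swapped_endian ? bswap_32(v) : v;
}

static uint64_t tgt64(const struct demo_gen *gen, uint64_t v)
{
	return gen->swapped_endian ? bswap_64(v) : v;
}

/* Hypothetical attribute record destined for the target-read blob. */
struct demo_attr {
	uint32_t map_type;
	uint32_t key_size;
	uint64_t map_extra;
};

/* Fill the record from native values, converting each field as it is set. */
static void demo_attr_fill(const struct demo_gen *gen, struct demo_attr *attr,
			   uint32_t map_type, uint32_t key_size, uint64_t map_extra)
{
	attr->map_type = tgt32(gen, map_type);
	attr->key_size = tgt32(gen, key_size);
	attr->map_extra = tgt64(gen, map_extra);
}
```

Once a field holds a target-endian value it should no longer be read back for host-side decisions or logging; the native inputs are the ones to use for that.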
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/lib/bpf/bpf_gen_internal.h |   1 +
 tools/lib/bpf/gen_loader.c       | 191 ++++++++++++++++++++++---------
 tools/lib/bpf/libbpf.c           |   1 +
 tools/lib/bpf/skel_internal.h    |   3 +-
 4 files changed, 143 insertions(+), 53 deletions(-)

diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index fdf44403ff36..6ff963a491d9 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -34,6 +34,7 @@ struct bpf_gen {
 	void *data_cur;
 	void *insn_start;
 	void *insn_cur;
+	bool swapped_endian;
 	ssize_t cleanup_label;
 	__u32 nr_progs;
 	__u32 nr_maps;
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index cf3323fd47b8..9a8614f945dd 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -401,6 +401,15 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps)
 		opts->insns_sz = gen->insn_cur - gen->insn_start;
 		opts->data = gen->data_start;
 		opts->data_sz = gen->data_cur - gen->data_start;
+
+		/* use target endianness for embedded loader */
+		if (gen->swapped_endian) {
+			struct bpf_insn *insn = (struct bpf_insn *)opts->insns;
+			int insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
+
+			for (i = 0; i < insn_cnt; i++)
+				bpf_insn_bswap(insn++);
+		}
 	}
 	return gen->error;
 }
@@ -414,6 +423,28 @@ void bpf_gen__free(struct bpf_gen *gen)
 	free(gen);
 }

+/*
+ * Fields of bpf_attr are set to values in native byte-order before being
+ * written to the target-bound data blob, and may need endian conversion.
+ * This macro allows providing the correct value in situ more simply than
+ * writing a separate converter for *all fields* of *all records* included
+ * in union bpf_attr. Note that sizeof(rval) should match the assignment
+ * target to avoid runtime problems.
+ */
+#define tgt_endian(rval) ({				\
+	typeof(rval) _val = (rval);			\
+	if (gen->swapped_endian) {			\
+		switch (sizeof(_val)) {			\
+		case 1: break;				\
+		case 2: _val = bswap_16(_val); break;	\
+		case 4: _val = bswap_32(_val); break;	\
+		case 8: _val = bswap_64(_val); break;	\
+		default: pr_warn("unsupported bswap size!\n");	\
+		}					\
+	}						\
+	_val;						\
+})
+
 void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
 		       __u32 btf_raw_size)
 {
@@ -422,11 +453,12 @@ void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
 	union bpf_attr attr;

 	memset(&attr, 0, attr_size);
-	pr_debug("gen: load_btf: size %d\n", btf_raw_size);
 	btf_data = add_data(gen, btf_raw_data, btf_raw_size);

-	attr.btf_size = btf_raw_size;
+	attr.btf_size = tgt_endian(btf_raw_size);
 	btf_load_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: load_btf: off %d size %d, attr: off %d size %d\n",
+		 btf_data, btf_raw_size, btf_load_attr, attr_size);

 	/* populate union bpf_attr with user provided log details */
 	move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_level), 4,
@@ -457,28 +489,29 @@ void bpf_gen__map_create(struct bpf_gen *gen,
 	union bpf_attr attr;

 	memset(&attr, 0, attr_size);
-	attr.map_type = map_type;
-	attr.key_size = key_size;
-	attr.value_size = value_size;
-	attr.map_flags = map_attr->map_flags;
-	attr.map_extra = map_attr->map_extra;
+	attr.map_type = tgt_endian(map_type);
+	attr.key_size = tgt_endian(key_size);
+	attr.value_size = tgt_endian(value_size);
+	attr.map_flags = tgt_endian(map_attr->map_flags);
+	attr.map_extra = tgt_endian(map_attr->map_extra);
 	if (map_name)
 		libbpf_strlcpy(attr.map_name, map_name, sizeof(attr.map_name));
-	attr.numa_node = map_attr->numa_node;
-	attr.map_ifindex = map_attr->map_ifindex;
-	attr.max_entries = max_entries;
-	attr.btf_key_type_id = map_attr->btf_key_type_id;
-	attr.btf_value_type_id = map_attr->btf_value_type_id;
-
-	pr_debug("gen: map_create: %s idx %d type %d value_type_id %d\n",
-		 attr.map_name, map_idx, map_type, attr.btf_value_type_id);
+	attr.numa_node = tgt_endian(map_attr->numa_node);
+	attr.map_ifindex = tgt_endian(map_attr->map_ifindex);
+	attr.max_entries = tgt_endian(max_entries);
+	attr.btf_key_type_id = tgt_endian(map_attr->btf_key_type_id);
+	attr.btf_value_type_id = tgt_endian(map_attr->btf_value_type_id);

 	map_create_attr = add_data(gen, &attr, attr_size);
-	if (attr.btf_value_type_id)
+	pr_debug("gen: map_create: %s idx %d type %d value_type_id %d, attr: off %d size %d\n",
+		 map_name, map_idx, map_type, map_attr->btf_value_type_id,
+		 map_create_attr, attr_size);
+
+	if (map_attr->btf_value_type_id)
 		/* populate union bpf_attr with btf_fd saved in the stack earlier */
 		move_stack2blob(gen, attr_field(map_create_attr, btf_fd), 4,
 				stack_off(btf_fd));
-	switch (attr.map_type) {
+	switch (map_type) {
 	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
 	case BPF_MAP_TYPE_HASH_OF_MAPS:
 		move_stack2blob(gen, attr_field(map_create_attr, inner_map_fd), 4,
@@ -498,8 +531,8 @@ void bpf_gen__map_create(struct bpf_gen *gen,
 	/* emit MAP_CREATE command */
 	emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
 	debug_ret(gen, "map_create %s idx %d type %d value_size %d value_btf_id %d",
-		  attr.map_name, map_idx, map_type, value_size,
-		  attr.btf_value_type_id);
+		  map_name, map_idx, map_type, value_size,
+		  map_attr->btf_value_type_id);
 	emit_check_err(gen);
 	/* remember map_fd in the stack, if successful */
 	if (map_idx < 0) {
@@ -784,12 +817,12 @@ static void emit_relo_ksym_typeless(struct bpf_gen *gen,
 	emit_ksym_relo_log(gen, relo, kdesc->ref);
 }

-static __u32 src_reg_mask(void)
+static __u32 src_reg_mask(struct bpf_gen *gen)
 {
-#if defined(__LITTLE_ENDIAN_BITFIELD)
-	return 0x0f; /* src_reg,dst_reg,... */
-#elif defined(__BIG_ENDIAN_BITFIELD)
-	return 0xf0; /* dst_reg,src_reg,... */
+#if defined(__LITTLE_ENDIAN_BITFIELD) /* src_reg,dst_reg,... */
+	return gen->swapped_endian ? 0xf0 : 0x0f;
+#elif defined(__BIG_ENDIAN_BITFIELD) /* dst_reg,src_reg,... */
+	return gen->swapped_endian ? 0x0f : 0xf0;
 #else
 #error "Unsupported bit endianness, cannot proceed"
 #endif
@@ -840,7 +873,7 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo,
 	emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 3));
 clear_src_reg:
 	/* clear bpf_object__relocate_data's src_reg assignment, otherwise we get a verifier failure */
-	reg_mask = src_reg_mask();
+	reg_mask = src_reg_mask(gen);
 	emit(gen, BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_8, offsetofend(struct bpf_insn, code)));
 	emit(gen, BPF_ALU32_IMM(BPF_AND, BPF_REG_9, reg_mask));
 	emit(gen, BPF_STX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, offsetofend(struct bpf_insn, code)));
@@ -931,48 +964,96 @@ static void cleanup_relos(struct bpf_gen *gen, int insns)
 	cleanup_core_relo(gen);
 }

+/* Convert func, line, and core relo info blobs to target endianness */
+static void info_blob_bswap(struct bpf_gen *gen, int func_info, int line_info,
+			    int core_relos, struct bpf_prog_load_opts *load_attr)
+{
+	struct bpf_func_info *fi = gen->data_start + func_info;
+	struct bpf_line_info *li = gen->data_start + line_info;
+	struct bpf_core_relo *cr = gen->data_start + core_relos;
+	int i;
+
+	if (!gen->swapped_endian)
+		return;
+
+	for (i = 0; i < load_attr->func_info_cnt; i++)
+		bpf_func_info_bswap(fi++);
+
+	for (i = 0; i < load_attr->line_info_cnt; i++)
+		bpf_line_info_bswap(li++);
+
+	for (i = 0; i < gen->core_relo_cnt; i++)
+		bpf_core_relo_bswap(cr++);
+}
+
 void bpf_gen__prog_load(struct bpf_gen *gen,
 			enum bpf_prog_type prog_type, const char *prog_name,
 			const char *license, struct bpf_insn *insns, size_t insn_cnt,
 			struct bpf_prog_load_opts *load_attr, int prog_idx)
 {
+	int func_info_tot_sz = load_attr->func_info_cnt *
+			       load_attr->func_info_rec_size;
+	int line_info_tot_sz = load_attr->line_info_cnt *
+			       load_attr->line_info_rec_size;
+	int core_relo_tot_sz = gen->core_relo_cnt *
+			       sizeof(struct bpf_core_relo);
 	int prog_load_attr, license_off, insns_off, func_info, line_info, core_relos;
 	int attr_size = offsetofend(union bpf_attr, core_relo_rec_size);
 	union bpf_attr attr;

 	memset(&attr, 0, attr_size);
-	pr_debug("gen: prog_load: type %d insns_cnt %zd progi_idx %d\n",
-		 prog_type, insn_cnt, prog_idx);
 	/* add license string to blob of bytes */
 	license_off = add_data(gen, license, strlen(license) + 1);
 	/* add insns to blob of bytes */
 	insns_off = add_data(gen, insns, insn_cnt * sizeof(struct bpf_insn));
+	pr_debug("gen: prog_load: prog_idx %d type %d insn off %d insns_cnt %zd license off %d\n",
+		 prog_idx, prog_type, insns_off, insn_cnt, license_off);

-	attr.prog_type = prog_type;
-	attr.expected_attach_type = load_attr->expected_attach_type;
-	attr.attach_btf_id = load_attr->attach_btf_id;
-	attr.prog_ifindex = load_attr->prog_ifindex;
-	attr.kern_version = 0;
-	attr.insn_cnt = (__u32)insn_cnt;
-	attr.prog_flags = load_attr->prog_flags;
-
-	attr.func_info_rec_size = load_attr->func_info_rec_size;
-	attr.func_info_cnt = load_attr->func_info_cnt;
-	func_info = add_data(gen, load_attr->func_info,
-			     attr.func_info_cnt * attr.func_info_rec_size);
+	/* convert blob insns to target endianness */
+	if (gen->swapped_endian) {
+		struct bpf_insn *insn = gen->data_start + insns_off;
+		int i;

-	attr.line_info_rec_size = load_attr->line_info_rec_size;
-	attr.line_info_cnt = load_attr->line_info_cnt;
-	line_info = add_data(gen, load_attr->line_info,
-			     attr.line_info_cnt * attr.line_info_rec_size);
+		for (i = 0; i < insn_cnt; i++, insn++)
+			bpf_insn_bswap(insn);
+	}

-	attr.core_relo_rec_size = sizeof(struct bpf_core_relo);
-	attr.core_relo_cnt = gen->core_relo_cnt;
-	core_relos = add_data(gen, gen->core_relos,
-			      attr.core_relo_cnt * attr.core_relo_rec_size);
+	attr.prog_type = tgt_endian(prog_type);
+	attr.expected_attach_type = tgt_endian(load_attr->expected_attach_type);
+	attr.attach_btf_id = tgt_endian(load_attr->attach_btf_id);
+	attr.prog_ifindex = tgt_endian(load_attr->prog_ifindex);
+	attr.kern_version = 0;
+	attr.insn_cnt = tgt_endian((__u32)insn_cnt);
+	attr.prog_flags = tgt_endian(load_attr->prog_flags);
+
+	attr.func_info_rec_size = tgt_endian(load_attr->func_info_rec_size);
+	attr.func_info_cnt = tgt_endian(load_attr->func_info_cnt);
+	func_info = add_data(gen, load_attr->func_info, func_info_tot_sz);
+	pr_debug("gen: prog_load: func_info: off %d cnt %d rec size %d\n",
+		 func_info, load_attr->func_info_cnt,
+		 load_attr->func_info_rec_size);
+
+	attr.line_info_rec_size = tgt_endian(load_attr->line_info_rec_size);
+	attr.line_info_cnt = tgt_endian(load_attr->line_info_cnt);
+	line_info = add_data(gen, load_attr->line_info, line_info_tot_sz);
+	pr_debug("gen: prog_load: line_info: off %d cnt %d rec size %d\n",
+		 line_info, load_attr->line_info_cnt,
+		 load_attr->line_info_rec_size);
+
+	attr.core_relo_rec_size = tgt_endian((__u32)sizeof(struct bpf_core_relo));
+	attr.core_relo_cnt = tgt_endian(gen->core_relo_cnt);
+	core_relos = add_data(gen, gen->core_relos, core_relo_tot_sz);
+	pr_debug("gen: prog_load: core_relos: off %d cnt %d rec size %zd\n",
+		 core_relos, gen->core_relo_cnt,
+		 sizeof(struct bpf_core_relo));
+
+	/* convert all info blobs to target endianness */
+	info_blob_bswap(gen, func_info, line_info, core_relos, load_attr);
libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name)); prog_load_attr = add_data(gen, &attr, attr_size); + pr_debug("gen: prog_load: attr: off %d size %d\n", + prog_load_attr, attr_size);
/* populate union bpf_attr with a pointer to license */ emit_rel_store(gen, attr_field(prog_load_attr, license), license_off); @@ -1040,10 +1121,11 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue, int zero = 0;
 	memset(&attr, 0, attr_size);
-	pr_debug("gen: map_update_elem: idx %d\n", map_idx);

 	value = add_data(gen, pvalue, value_size);
 	key = add_data(gen, &zero, sizeof(zero));
+	pr_debug("gen: map_update_elem: idx %d, value: off %d size %d\n",
+		 map_idx, value, value_size);

 	/* if (map_desc[map_idx].initial_value) {
 	 *    if (ctx->flags & BPF_SKEL_KERNEL)
@@ -1068,6 +1150,8 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
 	emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));

 	map_update_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: map_update_elem: attr: off %d size %d\n",
+		 map_update_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
 		       blob_fd_array_off(gen, map_idx));
 	emit_rel_store(gen, attr_field(map_update_attr, key), key);
@@ -1084,14 +1168,16 @@ void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int slo
 	int attr_size = offsetofend(union bpf_attr, flags);
 	int map_update_attr, key;
 	union bpf_attr attr;
+	int tgt_slot;

 	memset(&attr, 0, attr_size);
-	pr_debug("gen: populate_outer_map: outer %d key %d inner %d\n",
-		 outer_map_idx, slot, inner_map_idx);

-	key = add_data(gen, &slot, sizeof(slot));
+	tgt_slot = tgt_endian(slot);
+	key = add_data(gen, &tgt_slot, sizeof(tgt_slot));

 	map_update_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: populate_outer_map: outer %d key %d inner %d, attr: off %d size %d\n",
+		 outer_map_idx, slot, inner_map_idx, map_update_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
 		       blob_fd_array_off(gen, outer_map_idx));
 	emit_rel_store(gen, attr_field(map_update_attr, key), key);
@@ -1112,8 +1198,9 @@ void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
 	union bpf_attr attr;

 	memset(&attr, 0, attr_size);
-	pr_debug("gen: map_freeze: idx %d\n", map_idx);
 	map_freeze_attr = add_data(gen, &attr, attr_size);
+	pr_debug("gen: map_freeze: idx %d, attr: off %d size %d\n",
+		 map_idx, map_freeze_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_freeze_attr, map_fd), 4,
 		       blob_fd_array_off(gen, map_idx));
 	/* emit MAP_FREEZE command */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 46f41ea5e74d..6a1347f0eda6 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9125,6 +9125,7 @@ int bpf_object__gen_loader(struct bpf_object *obj, struct gen_loader_opts *opts)
 	if (!gen)
 		return -ENOMEM;
 	gen->opts = opts;
+	gen->swapped_endian = !is_native_endianness(obj);
 	obj->gen_loader = gen;
 	return 0;
 }
diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h
index 1e82ab06c3eb..67e8477ecb5b 100644
--- a/tools/lib/bpf/skel_internal.h
+++ b/tools/lib/bpf/skel_internal.h
@@ -351,10 +351,11 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
 	attr.test.ctx_size_in = opts->ctx->sz;
 	err = skel_sys_bpf(BPF_PROG_RUN, &attr, test_run_attr_sz);
 	if (err < 0 || (int)attr.test.retval < 0) {
-		opts->errstr = "failed to execute loader prog";
 		if (err < 0) {
+			opts->errstr = "failed to execute loader prog";
 			set_err;
 		} else {
+			opts->errstr = "error returned by loader prog";
 			err = (int)attr.test.retval;
 #ifndef __KERNEL__
 			errno = -err;
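
For readers following the gen_loader changes, here is a minimal standalone sketch of the three conversions the patch relies on: conditional byte-swapping of attr fields, swapping an instruction record, and flipping the register mask. All names (demo_gen, demo_tgt_endian32, demo_insn, demo_insn_bswap, demo_src_reg_mask) are hypothetical and are not the libbpf helpers. As a simplifying assumption, the two 4-bit register fields are modelled as one plain byte instead of C bitfields, and the host bitfield order is passed as a parameter instead of being taken from __LITTLE_ENDIAN_BITFIELD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the generator state; only the flag matters here. */
struct demo_gen {
	bool swapped_endian; /* true when target byte order differs from host */
};

/* Portable byte swaps, written out to avoid relying on compiler builtins. */
static inline uint16_t demo_bswap16(uint16_t v)
{
	return (uint16_t)((v >> 8) | (v << 8));
}

static inline uint32_t demo_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Host-order value to target order: swap only when the orders differ. */
static inline uint32_t demo_tgt_endian32(const struct demo_gen *gen, uint32_t v)
{
	return gen->swapped_endian ? demo_bswap32(v) : v;
}

/* Simplified 8-byte instruction record (assumed layout for illustration). */
struct demo_insn {
	uint8_t code;
	uint8_t regs; /* two 4-bit register fields packed in one byte */
	int16_t off;
	int32_t imm;
};

/* Swap one record: exchange the register nibbles, byte-swap off and imm. */
static inline void demo_insn_bswap(struct demo_insn *insn)
{
	insn->regs = (uint8_t)((insn->regs << 4) | (insn->regs >> 4));
	insn->off = (int16_t)demo_bswap16((uint16_t)insn->off);
	insn->imm = (int32_t)demo_bswap32((uint32_t)insn->imm);
}

/*
 * Mask that clears src_reg and keeps dst_reg in the target's register byte.
 * host_le_bitfield says whether the host packs the first bitfield member
 * into the low nibble; a swapped target uses the opposite nibble.
 */
static inline uint8_t demo_src_reg_mask(const struct demo_gen *gen,
					bool host_le_bitfield)
{
	uint8_t host_mask = host_le_bitfield ? 0x0f : 0xf0;

	return gen->swapped_endian ? (uint8_t)~host_mask : host_mask;
}
```

With swapped_endian set, demo_tgt_endian32() turns 0x11223344 into 0x44332211 and the mask for a little-endian-bitfield host becomes 0xf0; with the flag clear, values pass through untouched, which mirrors why the native-endian path needs no extra work.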
On Tue, Sep 3, 2024 at 12:34 AM Tony Ambardar <tony.ambardar@gmail.com> wrote:
> @@ -1040,10 +1121,11 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
>  	int zero = 0;
>
>  	memset(&attr, 0, attr_size);
> -	pr_debug("gen: map_update_elem: idx %d\n", map_idx);
>
>  	value = add_data(gen, pvalue, value_size);
>  	key = add_data(gen, &zero, sizeof(zero));
> +	pr_debug("gen: map_update_elem: idx %d, value: off %d size %d\n",
> +		 map_idx, value, value_size);
>
>  	/* if (map_desc[map_idx].initial_value) {
>  	 *    if (ctx->flags & BPF_SKEL_KERNEL)
> @@ -1068,6 +1150,8 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
>  	emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));
>
>  	map_update_attr = add_data(gen, &attr, attr_size);
> +	pr_debug("gen: map_update_elem: attr: off %d size %d\n",
> +		 map_update_attr, attr_size);
>  	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
>  		       blob_fd_array_off(gen, map_idx));
>  	emit_rel_store(gen, attr_field(map_update_attr, key), key);

I don't see the point of two pr_debug("gen: map_update_elem... just a few lines from each other.
Other than that:
Acked-by: Alexei Starovoitov <ast@kernel.org>
On Tue, Sep 03, 2024 at 12:57:51PM -0700, Alexei Starovoitov wrote:
> I don't see the point of two pr_debug("gen: map_update_elem... just a few lines from each other.
>
> Other than that:
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Thanks for reviewing, Alexei. I agree those could be consolidated, and I tested the following patch to do so. I'll include it if another respin is needed; otherwise, it could be folded in during merge.
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -1124,8 +1124,6 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,

 	value = add_data(gen, pvalue, value_size);
 	key = add_data(gen, &zero, sizeof(zero));
-	pr_debug("gen: map_update_elem: idx %d, value: off %d size %d\n",
-		 map_idx, value, value_size);

 	/* if (map_desc[map_idx].initial_value) {
 	 *    if (ctx->flags & BPF_SKEL_KERNEL)
@@ -1150,8 +1148,8 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
 	emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));

 	map_update_attr = add_data(gen, &attr, attr_size);
-	pr_debug("gen: map_update_elem: attr: off %d size %d\n",
-		 map_update_attr, attr_size);
+	pr_debug("gen: map_update_elem: idx %d, value: off %d size %d, attr: off %d size %d\n",
+		 map_idx, value, value_size, map_update_attr, attr_size);
 	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
 		       blob_fd_array_off(gen, map_idx));
 	emit_rel_store(gen, attr_field(map_update_attr, key), key);

Update Makefile build rules to compile BPF programs with target endianness rather than host byte-order. With recent changes, this allows building the full selftests/bpf suite hosted on x86_64 and targeting s390x or mips64eb for example.
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
---
 tools/testing/selftests/bpf/Makefile | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 7660d19b66c2..1f21d3a0c20f 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -442,6 +442,7 @@ endef
 IS_LITTLE_ENDIAN = $(shell $(CC) -dM -E - </dev/null | \
 			grep 'define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__')
 MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian)
+BPF_TARGET_ENDIAN=$(if $(IS_LITTLE_ENDIAN),--target=bpfel,--target=bpfeb)

 ifneq ($(CROSS_COMPILE),)
 CLANG_TARGET_ARCH = --target=$(notdir $(CROSS_COMPILE:%-=%))
@@ -469,17 +470,17 @@ $(OUTPUT)/cgroup_getset_retval_hooks.o: cgroup_getset_retval_hooks.h
 # $4 - binary name
 define CLANG_BPF_BUILD_RULE
 	$(call msg,CLNG-BPF,$4,$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v3 -o $2
+	$(Q)$(CLANG) $3 -O2 $(BPF_TARGET_ENDIAN) -c $1 -mcpu=v3 -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but with disabled alu32
 define CLANG_NOALU32_BPF_BUILD_RULE
 	$(call msg,CLNG-BPF,$4,$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v2 -o $2
+	$(Q)$(CLANG) $3 -O2 $(BPF_TARGET_ENDIAN) -c $1 -mcpu=v2 -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but with cpu-v4
 define CLANG_CPUV4_BPF_BUILD_RULE
 	$(call msg,CLNG-BPF,$4,$2)
-	$(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v4 -o $2
+	$(Q)$(CLANG) $3 -O2 $(BPF_TARGET_ENDIAN) -c $1 -mcpu=v4 -o $2
 endef
 # Build BPF object using GCC
 define GCC_BPF_BUILD_RULE

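
The IS_LITTLE_ENDIAN check above works because GCC and Clang predefine __BYTE_ORDER__ for the compilation target, so the value reflects the cross-compiler's target rather than the build host. The sketch below shows the same decision made in C; DEMO_BPF_TARGET, demo_bpf_target() and demo_is_little_endian() are hypothetical names used only for illustration, and the code assumes a compiler that provides the __BYTE_ORDER__ family of macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Compile-time view: pick the BPF target name from the compiler's
 * predefined byte-order macros, much like the Makefile picks between
 * --target=bpfel and --target=bpfeb.
 */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define DEMO_BPF_TARGET "bpfel"
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define DEMO_BPF_TARGET "bpfeb"
#else
#error "Unknown byte order, cannot pick a BPF target"
#endif

static inline const char *demo_bpf_target(void)
{
	return DEMO_BPF_TARGET;
}

/* Run-time cross-check: look at the first byte in memory of the value 1. */
static inline int demo_is_little_endian(void)
{
	const uint32_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, sizeof(first));
	return first == 1;
}
```

When this is built with a cross-compiler, demo_bpf_target() names the byte order of the machine the binary will run on, which is the property the build rules need: a plain --target=bpf follows the byte order of the host running clang, so BPF objects built on x86_64 would otherwise come out little-endian even for an s390x test system.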
On 9/3/24 12:33 AM, Tony Ambardar wrote:
> Update Makefile build rules to compile BPF programs with target endianness rather than host byte-order. With recent changes, this allows building the full selftests/bpf suite hosted on x86_64 and targeting s390x or mips64eb for example.
>
> Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>

LGTM.
Acked-by: Yonghong Song <yonghong.song@linux.dev>