On Mon, Sep 2, 2024 at 1:19 AM Tony Ambardar tony.ambardar@gmail.com wrote:
On Fri, Aug 30, 2024 at 02:14:19PM -0700, Andrii Nakryiko wrote:
On Fri, Aug 30, 2024 at 12:30 AM Tony Ambardar tony.ambardar@gmail.com wrote:
Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage.
Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records.
Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data.
[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Tony Ambardar tony.ambardar@gmail.com
tools/lib/bpf/btf.c | 192 +++++++++++++++++++++++++++++--- tools/lib/bpf/btf.h | 3 + tools/lib/bpf/libbpf.map | 2 + tools/lib/bpf/libbpf_internal.h | 33 ++++++ 4 files changed, 214 insertions(+), 16 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index f5081de86ee0..064cfe126c09 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -3022,25 +3022,102 @@ static int btf_ext_setup_core_relos(struct btf_ext *btf_ext) return btf_ext_setup_info(btf_ext, ¶m); }
-static int btf_ext_parse_hdr(__u8 *data, __u32 data_size) +/* Swap byte-order of BTF.ext header with any endianness */ +static void btf_ext_bswap_hdr(struct btf_ext *btf_ext, __u32 hdr_len) {
const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
struct btf_ext_header *h = btf_ext->hdr;
if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
data_size < hdr->hdr_len) {
pr_debug("BTF.ext header not found\n");
h->magic = bswap_16(h->magic);
h->hdr_len = bswap_32(h->hdr_len);
h->func_info_off = bswap_32(h->func_info_off);
h->func_info_len = bswap_32(h->func_info_len);
h->line_info_off = bswap_32(h->line_info_off);
h->line_info_len = bswap_32(h->line_info_len);
if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
h->core_relo_off = bswap_32(h->core_relo_off);
h->core_relo_len = bswap_32(h->core_relo_len);
+}
+/* Swap byte-order of a generic info subsection */ +static void info_subsec_bswap(const struct btf_ext_header *hdr, bool native,
__u32 off, __u32 len, anon_info_bswap_fn_t bswap)
ok, so I'm not a fan of this bswap callback, tbh. Also, we don't really enforce that each kind of record has exact size we expect (i.e., bpf_line_info_min and bpf_func_info_min shouldn't be "min" for byte-swapped case, it should be exact).
How about this slight modification: split byte swapping of sections/subsection metadata, so we adjust record size, sec_name_off and num_info separately from adjusting each record.
Hmmm, the bulk of code needed is to parse the metadata, with only 2 lines used to go through records. Splitting per above would add unnecessary duplication it seems, no?
Once this swapping is done we:
a) validate record size for each section is expected (according to its type, of course)
This is a good point I overlooked, and needs doing in any case.
b) we can then use for_each_btf_ext_sec() and for_each_btf_ext_rec() macro (which assume proper in-memory metadata byte order) and then hard-code swapping of each record fields
How easily can we use these macros? Consider the current call chain:
Not that easily, turns out, because a) it acquires data pointer implicitly, which makes it hard for btf_ext_raw_data() and b) it accesses sec->num_info and doesn't cache it, so we'd need an extra local variable to keep it if we were to swap it in raw data.
btf_ext__new btf_ext_parse btf_ext_bswap_hdr (1) btf_ext_bswap_info (2) btf_ext_setup_func_info btf_ext_setup_line_info btf_ext_setup_core_relos (3)
btf_ext__raw_data btf_ext_bswap_info (4) btf_ext_bswap_hdr
The macros iterate on 'struct btf_ext_info' instances in 'struct btf_ext' but these are only set up after (3) it seems and unavailable at (2). I suppose they could be used with some sort of kludge but unsure how well they'll work.
No callbacks.
This has also a benefit of not needing this annoying "bool native" flag when producing raw bytes. We just ensure proper order of operation:
a) swap records b) swap metadata (so just mirrored order from initialization)
How does that work? If we split up btf_ext_bswap_info(), after (1) btf_ext->swapped_endian is set and btf_ext->hdr->magic is swapped, so at (2) it's not possible to tell the current info data byte order without some hinting.
But maybe if we defer setting btf_ext->swapped_endian until after (b) we can drop the "bool native" thanks to symmetry breaking. Let me check.
WDYT?
Adding a record_size check is definitely needed.
But I have trouble seeing how splitting bswap of info metadata/records would yield something simpler and cleaner than the callbacks. What if they were passed via a descriptor, as in btf_ext_setup_func_info()? I think I need to play around with this a while and see..
It would also help me if you'd elaborate on the drawbacks you see of using callbacks, given I see then in other parts of libbpf.
I replied to your latest patches. I generally dislike callbacks as they make following the code harder. If it's possible to not use callbacks with reasonable simplicity, I'll always go for that. But it's ok, given those existing iteration macros are a bit assuming about data and its endianness, it's hard to use them.
pw-bot: cr
+{
__u32 left, i, *rs, rec_size, num_info;
struct btf_ext_info_sec *si;
void *p;
if (len == 0)
return;
rs = (void *)hdr + hdr->hdr_len + off; /* record size */
si = (void *)rs + sizeof(__u32); /* sec info #1 */
rec_size = native ? *rs : bswap_32(*rs);
*rs = bswap_32(*rs);
left = len - sizeof(__u32);
while (left > 0) {
num_info = native ? si->num_info : bswap_32(si->num_info);
si->sec_name_off = bswap_32(si->sec_name_off);
si->num_info = bswap_32(si->num_info);
left -= offsetof(struct btf_ext_info_sec, data);
p = si->data;
for (i = 0; i < num_info; i++) /* list of records */
p += bswap(p);
si = p;
left -= rec_size * num_info;
nit: extra space here
Fixed, thanks.
}
+}
[...]