On Wed, Aug 28, 2024 at 04:11:50AM -0700, Tony Ambardar wrote:
Hello all,
[snip]
Changelog:
v2 -> v3: (feedback from Andrii)
- improve some log and commit message formatting
- restructure BTF.ext endianness safety checks and byte-swapping
- use BTF.ext info record definitions for swapping, require BTF v1
- follow BTF API implementation more closely for BTF.ext
- explicitly reject loading non-native endianness program into kernel
- simplify linker output byte-order setting
- drop redundant safety checks during linking
- simplify endianness macro and improve blob setup code for light skel
- no unexpected test failures after cross-compiling x86_64 -> s390x
Sadly, shortly after posting v3 I hit a strange new issue in CI testing.
Existing code in bpf_object__elf_finish() doesn't zero Ehdr references after freeing the related ELF data, allowing use of stale endian data which can be reallocated and overwritten, leading to rare, confusing CI errors like:
test_tailcall_count:PASS:open fentry_obj file 0 nsec test_tailcall_count:PASS:find fentry prog 0 nsec test_tailcall_count:PASS:set_attach_target subprog_tail 0 nsec libbpf: object 'tailcall_bpf2bp' is not native endianness test_tailcall_count:FAIL:load fentry_obj unexpected error: -4003 (errno 4003) #333/13 tailcalls/tailcall_bpf2bpf_fentry:FAIL
I have a minor patch to fix this but will wait for feedback on v3 before posting it together with any further requested changes in a v4. Apologies for the extra churn, and I'll attach the pending patch for reference.
Thanks, Tony