On Wed, 2025-02-05 at 14:33 -0800, Andrii Nakryiko wrote:
> I see two ways forward for you. Either you can break apart your BPF
> object of ~100 BPF programs into more independent BPF objects (seeing
> that programs can be independently loaded/unloaded depending on
> configuration, seems like you do have a bunch of logical independence,
> right?). I assume shared BPF maps are the biggest reason to keep all
> those programs together in one BPF object. To share BPF maps between
> multiple BPF objects libbpf provides two complementary interfaces:
>   - bpf_map__reuse_fd() for manual control
>   - BPF map pinning (could be declarative or manual)
> This way you can ensure that all BPF objects would use the same BPF
> map, where necessary.
I think this approach *could* work, but it could easily become complex for us: we'd need to track all the dependencies between programs and maps, and anything missed could lead to hard-to-debug refcounting bugs.

Further, splitting into objects incurs some performance and memory cost, because bpf_object__load_vmlinux_btf() would be called for each object and there's currently no way to share BTF data across objects. A single BPF object avoids this issue. Potentially, libbpf could cache some BTF data to lessen the impact.
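For reference, the map-sharing flow with bpf_map__reuse_fd() looks roughly like this (a sketch only; the object file names and the map name "shared_map" are placeholders, not anything from our actual setup):

```c
#include <bpf/libbpf.h>

int load_with_shared_map(void)
{
	struct bpf_object *a, *b;
	struct bpf_map *map_a, *map_b;
	int err;

	a = bpf_object__open_file("obj_a.bpf.o", NULL);
	if (!a)
		return -1;
	err = bpf_object__load(a);          /* creates the map, assigns its fd */
	if (err)
		return err;

	b = bpf_object__open_file("obj_b.bpf.o", NULL);
	if (!b)
		return -1;
	map_a = bpf_object__find_map_by_name(a, "shared_map");
	map_b = bpf_object__find_map_by_name(b, "shared_map");
	if (!map_a || !map_b)
		return -1;

	/* must happen before bpf_object__load(b): b then uses a's map fd
	 * instead of creating its own copy of the map */
	err = bpf_map__reuse_fd(map_b, bpf_map__fd(map_a));
	if (err)
		return err;

	return bpf_object__load(b);
}
```

With ~100 programs this per-pair wiring is exactly the dependency tracking I'm worried about above.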
> Alternatively, we can look at this problem as needing libbpf to only
> prepare BPF program code (doing all the relocations and stuff like
> that), but then application actually taking care of loading/unloading
> BPF program with bpf_prog_load() outside of bpf_object abstraction.
> I've had an almost ready patches splitting bpf_object__load() into two
> steps: bpf_object__prepare() and bpf_object__load() after that.
> "prepare" step would create BPF maps, load BTF information, perform
> necessary relocations and arrive at final state of BPF program code
> (which you can get with bpf_program__insns() API), but stopping just
> short of actually doing bpf_prog_load() step.
>
> This seems like it would solve your problem as well. You'd use libbpf
> to do all the low-level ELF processing and relocation, but then take
> over managing BPF program lifetime. Loading/unloading as you see fit,
> including in parallel.
>
> Is this something that would work for you?
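For concreteness, here's how I'd imagine an application using that split (a sketch: bpf_prog_load(), bpf_program__insns(), bpf_program__insn_cnt() and bpf_program__type() exist in libbpf today, but bpf_object__prepare() is from your unmerged patches, so its exact name/signature is my assumption):

```c
#include <bpf/libbpf.h>
#include <bpf/bpf.h>

int load_one_prog(struct bpf_object *obj, const char *name)
{
	struct bpf_program *prog;
	int err, fd;

	/* hypothetical: creates maps, loads BTF, performs relocations,
	 * but does not call bpf_prog_load() */
	err = bpf_object__prepare(obj);
	if (err)
		return err;

	prog = bpf_object__find_program_by_name(obj, name);
	if (!prog)
		return -1;

	/* application-controlled load, callable whenever (and, with
	 * per-call log buffers, from whichever thread) we choose */
	fd = bpf_prog_load(bpf_program__type(prog), name, "GPL",
			   bpf_program__insns(prog),
			   bpf_program__insn_cnt(prog), NULL);
	return fd < 0 ? fd : 0;
}
```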
I think this API could work, though we would need a few other modifications to correctly handle program/map dependencies and account for relocations. At a high level, we'd need:
1) A way to associate each BPF program with all the maps it will use (association of struct bpf_program * --> list of struct bpf_map * in some form). This is so that we can load/unload associated maps when we load/unload a program.
2) An API to create a BPF map, in case a new map needs to be loaded after initial startup.
3) An API to allow unloading a map while keeping map->fd reserved. This is important because the fd value is used by BPF program instructions, so without something like this, we'd have to redo the relocation process for any other BPF programs that access this map (and thus reload those programs too). This API could be implemented by dup'ing a placeholder fd.
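The fd-reservation idea in (3) is just the classic dup2() trick; here it is with ordinary fds (/dev/null and /dev/zero stand in for map objects, no BPF involved):

```c
#include <fcntl.h>
#include <unistd.h>

/* Reserve an fd *number* with a placeholder, then swap real objects in
 * and out with dup2() so the number itself never changes. Returns 0 on
 * success, -1 on failure. */
int reserve_demo(void)
{
	int reserved = open("/dev/null", O_RDONLY);  /* the stable "map fd" */
	if (reserved < 0)
		return -1;

	/* "unload": park a placeholder on the reserved number, so BPF
	 * instructions referring to that number would stay valid */
	int placeholder = open("/dev/null", O_RDONLY);
	if (placeholder < 0 || dup2(placeholder, reserved) != reserved)
		return -1;
	close(placeholder);

	/* "reload": install the new object on the same number */
	int newobj = open("/dev/zero", O_RDONLY);
	if (newobj < 0 || dup2(newobj, reserved) != reserved)
		return -1;
	close(newobj);

	/* the fd number survived and now refers to the new object */
	char c;
	long n = read(reserved, &c, 1);
	close(reserved);
	return (n == 1 && c == '\0') ? 0 : -1;
}
```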
Alternatively, if libbpf could automatically refcount maps across multiple BPF objects and load/unload them on demand, then all of the above could happen behind the scenes. This would be similar to the other approach you mentioned, but with libbpf doing the refcounting heavy lifting instead of leaving it to each application, making it more robust and elegant. It would mean changing libbpf to (a) synchronize access to some map functions and (b) allow struct bpf_map * to be shared across BPF objects. Perhaps a concept of a "collection of BPF objects" could allow for this.
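To illustrate the bookkeeping, here's a minimal refcount table of the kind libbpf could keep per shared map (purely hypothetical; the fd values are fake stand-ins for real map fds, and creation/destruction are elided):

```c
#include <string.h>

#define MAX_SHARED 16

struct shared_map {
	const char *name;
	int fd;        /* stand-in for the real map fd */
	int refcnt;
};

static struct shared_map tab[MAX_SHARED];
static int next_fake_fd = 100;

/* First user "creates" the map; later users share the same fd. */
static int map_acquire(const char *name)
{
	for (int i = 0; i < MAX_SHARED; i++) {
		if (tab[i].name && !strcmp(tab[i].name, name)) {
			tab[i].refcnt++;
			return tab[i].fd;
		}
	}
	for (int i = 0; i < MAX_SHARED; i++) {
		if (!tab[i].name) {
			tab[i] = (struct shared_map){ name, next_fake_fd++, 1 };
			return tab[i].fd;
		}
	}
	return -1;  /* table full */
}

/* Last user "destroys" the map; earlier releases just drop a ref. */
static void map_release(const char *name)
{
	for (int i = 0; i < MAX_SHARED; i++) {
		if (tab[i].name && !strcmp(tab[i].name, name)) {
			if (--tab[i].refcnt == 0)
				tab[i] = (struct shared_map){ 0 };
			return;
		}
	}
}
```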
> > This patch set also permits loading BPF programs in parallel if the
> > application wishes. We tested parallel loading with 200+ BPF
> > programs and found the load time dropped from 18 seconds to 5
> > seconds when done in parallel on a 6.8 kernel.
> bpf_object is intentionally single-threaded, so I don't think we'll
> be supporting parallel BPF program loading in the paradigm of
> bpf_object (but see the bpf_object__prepare() proposal). Even from
> API standpoint this is problematic with logging and log buffers
> basically assuming single-threaded execution of BPF program loading.
>
> All that could be changed or worked around, but your use case is not
> really a typical case, so I'm a bit hesitant at this point.
I can understand where you're coming from if no one else has mentioned a use case like this. We can get parallel loading by splitting our programs into multiple BPF objects, but unless the split is very even, load time suffers. For example, if 100 programs are split into two objects, one with 80 programs and the other with 20, the object with 80 programs becomes the bottleneck.