Re: [External] Re: [PATCH bpf-next v2 1/9] bpf: tracing: add support to record and check the accessed args

28 Mar 2024


      On Fri, Mar 15, 2024 at 4:00 PM 梦龙董 dongmenglong.8@bytedance.com wrote:
...
On Thu, Mar 14, 2024 at 8:27 AM Alexei Starovoitov
alexei.starovoitov@gmail.com wrote:
...
On Tue, Mar 12, 2024 at 6:53 PM 梦龙董 dongmenglong.8@bytedance.com wrote:
[......]
...
...
What does "a hundred attachments max" means? Can't I
trace thousands of kernel functions with a bpf program of
tracing multi-link?
I mean what time does it take to attach one program
to 100 fentry-s ?
What is the time for 1k and for 10k ?
The kprobe multi test attaches to pretty much all funcs in
/sys/kernel/tracing/available_filter_functions
and it's fast enough to run in test_progs on every commit in bpf CI.
See get_syms() in prog_tests/kprobe_multi_test.c
Can this new multi fentry do that?
and at what speed?
The answer will decide how applicable this api is going to be.
Generating different trampolines for every attach point
is an approach as well. Pls benchmark it too.
I see. Creating plenty of trampolines does take a lot of time,
and I'll do testing on it.
I have done a simple benchmark on creating 1000
trampolines. It is slow, quite slow, which consume up to
60s. We can't do it this way.
Now, I have a bad idea. How about we introduce
a "dynamic trampoline"? The basic logic of it can be:
"""
save regs
bpfs = trampoline_lookup_ip(ip)
fentry = bpfs->fentries
while fentry:
  fentry(ctx)
  fentry = fentry->next
call origin
save return value
fexit = bpfs->fexits
while fexit:
  fexit(ctx)
  fexit = fexit->next
xxxxxx
"""
And we lookup the "bpfs" by the function ip in a hash map
in trampoline_lookup_ip. The type of "bpfs" is:
struct bpf_array {
  struct bpf_prog *fentries;
 struct bpf_prog *fexits;
  struct bpf_prog *modify_returns;
}
When we need to attach the bpf progA to function A/B/C,
we only need to create the bpf_arrayA, bpf_arrayB, bpf_arrayC
and add the progA to them, and insert them to the hash map
"direct_call_bpfs", and attach the "dynamic trampoline" to
A/B/C. If bpf_arrayA exist, just add progA to the tail of
bpf_arrayA->fentries. When we need to attach progB to
B/C, just add progB to bpf_arrayB->fentries and
bpf_arrayB->fentries.
Compared to the trampoline, extra overhead is introduced
by the hash lookuping.
I have not begun to code yet, and I am not sure the overhead is
acceptable. Considering that we also need to do hash lookup
by the function in kprobe_multi, maybe the overhead is
acceptable?
Thanks!
Menglong Dong
...
...
...
...
Let's step back.
[......]
...
For one trampoline to handle all attach points we might
need some arch support, but we can start simple.
Make btf_func_model with MAX_BPF_FUNC_REG_ARGS
by calling btf_distill_func_proto() with func==NULL.
And use that to build a trampoline.
The challenge is how to use minimal number of trampolines
when bpf_progA is attached for func1, func2, func3
and bpf_progB is attached to func3, func4, func5.
We'd still need 3 trampolines:
for func[12] to call bpf_progA,
for func3 to call bpf_progA and bpf_progB,
for func[45] to call bpf_progB.
Jiri was trying to solve it in the past. His slides from LPC:
https://lpc.events/event/16/contributions/1350/attachments/1033/1983/plumber...
Pls study them and his prior patchsets to avoid stepping on the same rakes.

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [External] Re: [PATCH bpf-next v2 1/9] bpf: tracing: add support to record and check the accessed args