Hi everyone,
This patchset introduces a new BPF program type that allows overriding a tracepoint probe function registered via register_trace_*.
Motivation
----------
Tracepoint probe functions registered via register_trace_* in the kernel cannot be dynamically modified; changing a probe function requires recompiling the kernel and rebooting. Nor can BPF programs change an existing probe function.
Overriding a tracepoint probe provides a way to apply patches to the kernel quickly (for example, security fixes) through predefined static tracepoints, without waiting for upstream integration.
This patchset demonstrates how to override probe functions with a BPF program.
Overview
--------
This patchset adds the BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type. When a BPF program of this type attaches, it overrides the target tracepoint probe function.
It also adds a new struct type, "tracepoint_func_snapshot", and extends the tracepoint structure with a pointer to it. The snapshot records the original probe function registered by the kernel when the BPF program is attached, so that the original can be restored on detach.
Critical steps
--------------
1. Attach: Attach programs via the raw_tracepoint_open syscall.
2. Override:
   (a) Locate the target probe by `probe_name`.
   (b) Override the target probe with the BPF program.
   (c) Save the BPF program and the target probe function into "tracepoint_func_snapshot".
3. Restore: When the BPF program is detached, automatically restore the original probe function from the previously saved snapshot.
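To make the override and restore steps concrete, here is a minimal user-space sketch in C. It is only a toy model, not kernel code: toy_tracepoint, probe_snapshot, toy_override(), toy_restore() and toy_fire() are hypothetical names invented for this illustration, it keeps a single snapshot instead of an array, and it ignores locking, RCU and priorities entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; none of these names are kernel APIs. */
typedef int (*probe_fn)(int);

struct probe_snapshot {
	probe_fn orig;     /* probe that was registered first */
	probe_fn override; /* probe that replaced it */
};

#define MAX_PROBES 4

struct toy_tracepoint {
	probe_fn funcs[MAX_PROBES + 1]; /* NULL-terminated list of probes */
	struct probe_snapshot snap;     /* one snapshot keeps the sketch short */
};

/* Step 2: locate `target`, swap in `repl`, and remember both. */
static int toy_override(struct toy_tracepoint *tp, probe_fn target, probe_fn repl)
{
	assert(tp && target && repl);
	for (size_t i = 0; tp->funcs[i]; i++) {
		if (tp->funcs[i] == target) {
			tp->snap.orig = target;
			tp->snap.override = repl;
			tp->funcs[i] = repl;
			return 0;
		}
	}
	return -1; /* target probe is not registered */
}

/* Step 3: put the original probe back when the override goes away. */
static int toy_restore(struct toy_tracepoint *tp, probe_fn repl)
{
	assert(tp && repl);
	if (!tp->snap.override || tp->snap.override != repl)
		return -1; /* `repl` never overrode anything */
	for (size_t i = 0; tp->funcs[i]; i++) {
		if (tp->funcs[i] == repl) {
			tp->funcs[i] = tp->snap.orig;
			tp->snap.orig = NULL;
			tp->snap.override = NULL;
			return 0;
		}
	}
	return -1;
}

/* Fire the tracepoint: call every probe and sum the results. */
static int toy_fire(const struct toy_tracepoint *tp, int arg)
{
	int sum = 0;

	for (size_t i = 0; tp->funcs[i]; i++)
		sum += tp->funcs[i](arg);
	return sum;
}

/* Two example probes: the "original" and its "patched" replacement. */
static int orig_probe(int x) { return x; }
static int patched_probe(int x) { return x * 10; }
```

With a tracepoint initialised as { .funcs = { orig_probe, NULL } }, firing it calls orig_probe; after toy_override(&tp, orig_probe, patched_probe) it calls patched_probe instead, and toy_restore(&tp, patched_probe) brings orig_probe back.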
Future work
-----------
This patchset is intended as a first step toward supporting BPF programs that can override tracepoint probes. The current implementation may not yet cover all use cases or handle every corner case.
I welcome feedback and suggestions from the community, and will continue to refine and improve the design based on comments and real-world requirements.
Thanks!
Fuyu
Fuyu Zhao (3):
  bpf: Introduce BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE
  libbpf: Add support for BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE
  selftests/bpf: Add selftest for "raw_tp.o"
 include/linux/bpf_types.h                     |   2 +
 include/linux/trace_events.h                  |   9 +
 include/linux/tracepoint-defs.h               |   6 +
 include/linux/tracepoint.h                    |   3 +
 include/uapi/linux/bpf.h                      |   2 +
 kernel/bpf/syscall.c                          |  35 +++-
 kernel/trace/bpf_trace.c                      |  31 +++
 kernel/tracepoint.c                           | 190 +++++++++++++++++-
 tools/include/uapi/linux/bpf.h                |   2 +
 tools/lib/bpf/bpf.c                           |   1 +
 tools/lib/bpf/bpf.h                           |   3 +-
 tools/lib/bpf/libbpf.c                        |  27 ++-
 tools/lib/bpf/libbpf.h                        |   3 +-
 .../bpf/prog_tests/raw_tp_override_test_run.c |  23 +++
 .../bpf/progs/test_raw_tp_override_test_run.c |  20 ++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |   7 +
 16 files changed, 352 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/raw_tp_override_test_run.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_raw_tp_override_test_run.c
This patch introduces a new program type, BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE. A program of this type requires an additional parameter, probe_name, to locate the target tracepoint probe function registered by register_trace_* in the kernel.
This type reuses existing RAW_TRACEPOINT infrastructure, and differs only when probe_name is specified. In that case, the newly attached RAW_TRACEPOINT_OVERRIDE program and the target probe function are paired and stored in a snapshot.
When the BPF program is detached, snapshots are consulted to determine whether restoration of the original probe function is required.
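As an illustration only, the detach-time decision described above can be modeled in user space as follows. The sketch mirrors the idea of looking up the probe being unregistered in a snapshot array terminated by an entry whose override.func is NULL; toy_func, toy_snapshot, detach_action and classify_detach() are hypothetical names that exist only in this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a probe entry and a snapshot pair. */
struct toy_func {
	void *func;
	void *data;
};

struct toy_snapshot {
	struct toy_func orig;     /* probe registered by the kernel */
	struct toy_func override; /* BPF probe that replaced it */
};

enum detach_action {
	DETACH_PLAIN_REMOVE,    /* no snapshot: ordinary unregister */
	DETACH_RESTORE_ORIG,    /* override is leaving: put orig back */
	DETACH_REMOVE_OVERRIDE, /* orig is leaving: drop the override too */
};

static bool same_func(const struct toy_func *a, const struct toy_func *b)
{
	return a->func == b->func && a->data == b->data;
}

/*
 * `shots` may be NULL; otherwise it ends with an entry whose
 * override.func is NULL.
 */
static enum detach_action classify_detach(const struct toy_snapshot *shots,
					  const struct toy_func *f)
{
	assert(f != NULL);
	if (!shots)
		return DETACH_PLAIN_REMOVE;

	for (size_t i = 0; shots[i].override.func; i++) {
		if (same_func(&shots[i].override, f))
			return DETACH_RESTORE_ORIG;
		if (same_func(&shots[i].orig, f))
			return DETACH_REMOVE_OVERRIDE;
	}
	return DETACH_PLAIN_REMOVE;
}
```

A probe that matches neither side of any snapshot falls through to the ordinary removal path, which is why existing users of unregister would see no change in behaviour in this model.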
Signed-off-by: Fuyu Zhao <zhaofuyu@vivo.com>
---
 include/linux/bpf_types.h       |   2 +
 include/linux/trace_events.h    |   9 ++
 include/linux/tracepoint-defs.h |   6 +
 include/linux/tracepoint.h      |   3 +
 include/uapi/linux/bpf.h        |   2 +
 kernel/bpf/syscall.c            |  35 ++++--
 kernel/trace/bpf_trace.c        |  31 ++++++
 kernel/tracepoint.c             | 190 +++++++++++++++++++++++++++++++-
 8 files changed, 269 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index fa78f49d4a9a..e5cf8a1af6cd 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -48,6 +48,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, raw_tracepoint_writable, struct bpf_raw_tracepoint_args, u64) BPF_PROG_TYPE(BPF_PROG_TYPE_TRACING, tracing, void *, void *) +BPF_PROG_TYPE(BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE, raw_tracepoint_override, + struct bpf_raw_tracepoint_args, u64) #endif #ifdef CONFIG_CGROUP_BPF BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_DEVICE, cg_dev, diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index 04307a19cde3..fcb2d62d0c9f 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -768,6 +768,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info); struct bpf_raw_tp_link; int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_raw_tp_link *link); int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_raw_tp_link *link); +int bpf_probe_override(struct bpf_raw_event_map *btp, + struct bpf_raw_tp_link *link, + const char *probe_name);
struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name); void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp); @@ -805,6 +808,12 @@ static inline int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf { return -EOPNOTSUPP; } +static inline int bpf_probe_override(struct bpf_raw_event_map *btp, + struct bpf_raw_tp_link *link, + const char *probe_name) +{ + return -EOPNOTSUPP; +} static inline struct bpf_raw_event_map *bpf_get_raw_tracepoint(const char *name) { return NULL; diff --git a/include/linux/tracepoint-defs.h b/include/linux/tracepoint-defs.h index aebf0571c736..9d7b1710c0aa 100644 --- a/include/linux/tracepoint-defs.h +++ b/include/linux/tracepoint-defs.h @@ -29,6 +29,11 @@ struct tracepoint_func { int prio; };
+struct tracepoint_func_snapshot {
+	struct tracepoint_func orig;
+	struct tracepoint_func override;
+};
+
 struct tracepoint_ext {
 	int (*regfunc)(void);
 	void (*unregfunc)(void);
@@ -45,6 +50,7 @@ struct tracepoint {
 	void *probestub;
 	struct tracepoint_func __rcu *funcs;
 	struct tracepoint_ext *ext;
+	struct tracepoint_func_snapshot *snapshot;
 };
#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h index 826ce3f8e1f8..399001e2afca 100644 --- a/include/linux/tracepoint.h +++ b/include/linux/tracepoint.h @@ -50,6 +50,9 @@ tracepoint_probe_register_may_exist(struct tracepoint *tp, void *probe, return tracepoint_probe_register_prio_may_exist(tp, probe, data, TRACEPOINT_DEFAULT_PRIO); } +extern int +tracepoint_probe_override(struct tracepoint *tp, void *probe, void *data, + const char *func_replaced); extern void for_each_kernel_tracepoint(void (*fct)(struct tracepoint *tp, void *priv), void *priv); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 233de8677382..cd3d889fe634 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1071,6 +1071,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_SK_LOOKUP, BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */ BPF_PROG_TYPE_NETFILTER, + BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE, __MAX_BPF_PROG_TYPE };
@@ -1707,6 +1708,7 @@ union bpf_attr {
 		__u32 prog_fd;
 		__u32 :32;
 		__aligned_u64 cookie;
+		__aligned_u64 probe_name;
 	} raw_tracepoint;
struct { /* anonymous struct for BPF_BTF_LOAD */ diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 3f178a0f8eb1..e360062db34e 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -4092,14 +4092,16 @@ static int bpf_perf_link_attach(const union bpf_attr *attr, struct bpf_prog *pro #endif /* CONFIG_PERF_EVENTS */
static int bpf_raw_tp_link_attach(struct bpf_prog *prog, - const char __user *user_tp_name, u64 cookie, + const char __user *user_tp_name, + const char __user *user_probe_name, + u64 cookie, enum bpf_attach_type attach_type) { struct bpf_link_primer link_primer; struct bpf_raw_tp_link *link; struct bpf_raw_event_map *btp; - const char *tp_name; - char buf[128]; + const char *tp_name, *probe_name; + char buf[128], probe[128]; int err;
switch (prog->type) { @@ -4124,6 +4126,17 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog, buf[sizeof(buf) - 1] = 0; tp_name = buf; break; + case BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE: + if (strncpy_from_user(buf, user_tp_name, sizeof(buf) - 1) < 0) + return -EFAULT; + buf[sizeof(buf) - 1] = 0; + tp_name = buf; + + if (strncpy_from_user(probe, user_probe_name, sizeof(probe) - 1) < 0) + return -EFAULT; + probe[sizeof(probe) - 1] = 0; + probe_name = probe; + break; default: return -EINVAL; } @@ -4149,7 +4162,10 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog, goto out_put_btp; }
- err = bpf_probe_register(link->btp, link); + if (prog->type == BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE) + err = bpf_probe_override(link->btp, link, probe_name); + else + err = bpf_probe_register(link->btp, link); if (err) { bpf_link_cleanup(&link_primer); goto out_put_btp; @@ -4162,12 +4178,12 @@ static int bpf_raw_tp_link_attach(struct bpf_prog *prog, return err; }
-#define BPF_RAW_TRACEPOINT_OPEN_LAST_FIELD raw_tracepoint.cookie
+#define BPF_RAW_TRACEPOINT_OPEN_LAST_FIELD raw_tracepoint.probe_name
static int bpf_raw_tracepoint_open(const union bpf_attr *attr) { struct bpf_prog *prog; - void __user *tp_name; + void __user *tp_name, *probe_name; __u64 cookie; int fd;
@@ -4180,7 +4196,9 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
tp_name = u64_to_user_ptr(attr->raw_tracepoint.name); cookie = attr->raw_tracepoint.cookie; - fd = bpf_raw_tp_link_attach(prog, tp_name, cookie, prog->expected_attach_type); + probe_name = u64_to_user_ptr(attr->raw_tracepoint.probe_name); + fd = bpf_raw_tp_link_attach(prog, tp_name, probe_name, + cookie, prog->expected_attach_type); if (fd < 0) bpf_prog_put(prog); return fd; @@ -5565,7 +5583,8 @@ static int link_create(union bpf_attr *attr, bpfptr_t uattr) goto out; } if (prog->expected_attach_type == BPF_TRACE_RAW_TP) - ret = bpf_raw_tp_link_attach(prog, NULL, attr->link_create.tracing.cookie, + ret = bpf_raw_tp_link_attach(prog, NULL, NULL, + attr->link_create.tracing.cookie, attr->link_create.attach_type); else if (prog->expected_attach_type == BPF_TRACE_ITER) ret = bpf_iter_link_attach(attr, uattr, prog); diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 606007c387c5..1e965517ba05 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1998,6 +1998,14 @@ const struct bpf_verifier_ops raw_tracepoint_writable_verifier_ops = { const struct bpf_prog_ops raw_tracepoint_writable_prog_ops = { };
+const struct bpf_verifier_ops raw_tracepoint_override_verifier_ops = { + .get_func_proto = raw_tp_prog_func_proto, + .is_valid_access = raw_tp_writable_prog_is_valid_access, +}; + +const struct bpf_prog_ops raw_tracepoint_override_prog_ops = { +}; + static bool pe_prog_is_valid_access(int off, int size, enum bpf_access_type type, const struct bpf_prog *prog, struct bpf_insn_access_aux *info) @@ -2307,6 +2315,29 @@ BPF_TRACE_DEFN_x(10); BPF_TRACE_DEFN_x(11); BPF_TRACE_DEFN_x(12);
+int bpf_probe_override(struct bpf_raw_event_map *btp, + struct bpf_raw_tp_link *link, + const char *probe_name) +{ + struct tracepoint *tp = btp->tp; + struct bpf_prog *prog = link->link.prog; + + if (!probe_name) + return -EINVAL; + + /* + * check that program doesn't access arguments beyond what's + * available in this tracepoint + */ + if (prog->aux->max_ctx_offset > btp->num_args * sizeof(u64)) + return -EINVAL; + + if (prog->aux->max_tp_access > btp->writable_size) + return -EINVAL; + + return tracepoint_probe_override(tp, (void *)btp->bpf_func, link, probe_name); +} + int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_raw_tp_link *link) { struct tracepoint *tp = btp->tp; diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c index 62719d2941c9..3b8317306edc 100644 --- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -14,6 +14,7 @@ #include <linux/sched/signal.h> #include <linux/sched/task.h> #include <linux/static_key.h> +#include <linux/kallsyms.h>
enum tp_func_state { TP_FUNC_0, @@ -130,6 +131,121 @@ static void debug_print_probes(struct tracepoint_func *funcs) printk(KERN_DEBUG "Probe %d : %pSb\n", i, funcs[i].func); }
+static struct tracepoint_func * +find_func_to_override(struct tracepoint_func *funcs, + unsigned long probe_addr) +{ + int iter; + + if (!funcs) + return NULL; + + for (iter = 0; funcs[iter].func; iter++) { + if ((unsigned long)funcs[iter].func == probe_addr) + return &(funcs[iter]); + } + + return NULL; +} + +static struct tracepoint_func_snapshot * +find_func_snapshot(struct tracepoint_func_snapshot **ss, + struct tracepoint_func *func, + bool *is_override) +{ + int iter; + struct tracepoint_func_snapshot *shots; + + shots = *ss; + if (!shots) + return NULL; + + for (iter = 0; shots[iter].override.func; iter++) { + if (shots[iter].override.func == func->func && + shots[iter].override.data == func->data) { + *is_override = true; + return &(shots[iter]); + } + + if (shots[iter].orig.func == func->func && + shots[iter].orig.data == func->data) { + *is_override = false; + return &(shots[iter]); + } + } + + return NULL; +} + +static void drop_func_snapshot(struct tracepoint_func_snapshot **ss, + struct tracepoint_func_snapshot *drop) +{ + struct tracepoint_func_snapshot *old, *new; + int nr_snapshots; /* Counter for snapshots */ + int iter; /* Iterate over old snapshots */ + int idx = 0; /* Index of snapshot to drop */ + + old = *ss; + if (!old) + return; + + for (nr_snapshots = 0; old[nr_snapshots].override.func; nr_snapshots++) { + if (&(old[nr_snapshots]) == drop) + idx = nr_snapshots; + } + + if (nr_snapshots == 0) { + kfree(old); + *ss = NULL; + return; + } + + new = kmalloc_array(nr_snapshots, sizeof(struct tracepoint_func_snapshot), GFP_KERNEL); + if (!new) { + for (iter = idx; iter < nr_snapshots - 1; iter++) + old[iter] = old[iter + 1]; + memset(&(old[nr_snapshots - 1]), 0, sizeof(struct tracepoint_func_snapshot)); + } else { + int j = 0; + + for (iter = 0; iter < nr_snapshots; iter++) { + if (iter != idx) + new[j++] = old[iter]; + } + kfree(old); + *ss = new; + } +} + +static int save_func_snapshot(struct tracepoint_func_snapshot **ss, + struct 
tracepoint_func *new_func, + struct tracepoint_func *old_func) +{ + struct tracepoint_func_snapshot *old, *new; + int nr_shots = 0; /* Counter for old snapshots */ + int total; /* Total count of new snapshots */ + + old = *ss; + if (old) + while (old[nr_shots].override.func) + nr_shots++; + + /* + 2 : one for new snapshot, one for NULL snapshot */ + total = nr_shots + 2; + new = kmalloc_array(total, sizeof(struct tracepoint_func_snapshot), GFP_KERNEL); + if (!new) + return -ENOMEM; + + memcpy(new, old, nr_shots * sizeof(struct tracepoint_func_snapshot)); + new[nr_shots].orig = *old_func; + new[nr_shots].override = *new_func; + new[nr_shots + 1].override.func = NULL; + + *ss = new; + kfree(old); + return 0; +} + static struct tracepoint_func * func_add(struct tracepoint_func **funcs, struct tracepoint_func *tp_func, int prio) @@ -412,6 +528,52 @@ static int tracepoint_remove_func(struct tracepoint *tp, return 0; }
+static int tracepoint_override_func(struct tracepoint *tp,
+				    struct tracepoint_func *func,
+				    struct tracepoint_func *func_override)
+{
+	int ret = tracepoint_remove_func(tp, func);
+
+	return ret ? : tracepoint_add_func(tp, func_override,
+					   func_override->prio, false);
+}
+
+static int tracepoint_restore_func(struct tracepoint *tp,
+				   struct tracepoint_func *func,
+				   struct tracepoint_func *func_restore)
+{
+	int ret = tracepoint_remove_func(tp, func);
+
+	return ret ? : tracepoint_add_func(tp, func_restore,
+					   func_restore->prio, false);
+}
+
+int tracepoint_probe_override(struct tracepoint *tp, void *probe,
+			      void *data, const char *probe_name)
+{
+	struct tracepoint_func tp_func;
+	struct tracepoint_func *target_func;
+	unsigned long probe_addr;
+	int ret = -ESRCH;
+
+	probe_addr = kallsyms_lookup_name(probe_name);
+	mutex_lock(&tracepoints_mutex);
+	target_func = find_func_to_override(tp->funcs, probe_addr);
+	if (!target_func)
+		goto unlock;
+	tp_func.func = probe;
+	tp_func.data = data;
+	tp_func.prio = target_func->prio;
+	ret = save_func_snapshot(&(tp->snapshot), &tp_func, target_func);
+	if (ret)
+		goto unlock;
+
+	ret = tracepoint_override_func(tp, target_func, &tp_func);
+unlock:
+	mutex_unlock(&tracepoints_mutex);
+	return ret;
+}
+
 /**
  * tracepoint_probe_register_prio_may_exist - Connect a probe to a tracepoint with priority
  * @tp: tracepoint
@@ -496,12 +658,38 @@ EXPORT_SYMBOL_GPL(tracepoint_probe_register);
 int tracepoint_probe_unregister(struct tracepoint *tp, void *probe, void *data)
 {
 	struct tracepoint_func tp_func;
+	struct tracepoint_func_snapshot *shot;
 	int ret;
+	bool is_override; /* whether probe is an overriding func */
 	mutex_lock(&tracepoints_mutex);
 	tp_func.func = probe;
 	tp_func.data = data;
-	ret = tracepoint_remove_func(tp, &tp_func);
+
+	shot = find_func_snapshot(&(tp->snapshot), &tp_func, &is_override);
+	if (!shot) {
+		ret = tracepoint_remove_func(tp, &tp_func);
+	} else {
+		/* unregister probe registered by raw_tracepoint_open,
+		 * restore to original tp_func.
+		 *
+		 * 1. restore orig func from snapshot.
+		 * 2. remove snapshot.
+		 */
+		if (is_override)
+			ret = tracepoint_restore_func(tp, &tp_func, &(shot->orig));
+		/* unregister orig probe registered by register_trace_*.
+		 *
+		 * 1. remove curr probe func(registered by raw_tracepoint_open)
+		 *    from tp->funcs.
+		 * 2. remove snapshot.
+		 */
+		else
+			ret = tracepoint_remove_func(tp, &(shot->override));
+		if (!ret)
+			drop_func_snapshot(&(tp->snapshot), shot);
+	}
+
 	mutex_unlock(&tracepoints_mutex);
 	return ret;
 }
Extend libbpf to support the new BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type, making it available to user space applications through the standard libbpf API.
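For readers unfamiliar with the section-name convention used here, the following stand-alone sketch shows how a "tracepoint:probe" suffix (the part after "raw_tp.o/") could be split into its two halves. split_override_target() is a hypothetical helper written for this illustration and is not part of libbpf; unlike the patch, it also rejects an empty tracepoint or probe name, which is an extra assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not a libbpf API.
 * Splits "tp:probe" into two heap-allocated strings.
 * Returns 0 on success, -1 if the spec is malformed or allocation fails.
 * On success the caller owns *tp_name and *probe_name and must free() them.
 */
static int split_override_target(const char *spec, char **tp_name,
				 char **probe_name)
{
	const char *sep;
	size_t tp_len;

	assert(spec && tp_name && probe_name);

	sep = strchr(spec, ':');
	if (!sep || sep == spec || sep[1] == '\0')
		return -1; /* missing ':' or an empty half */

	tp_len = (size_t)(sep - spec);
	*tp_name = malloc(tp_len + 1);
	*probe_name = malloc(strlen(sep + 1) + 1);
	if (!*tp_name || !*probe_name) {
		free(*tp_name);
		free(*probe_name);
		*tp_name = NULL;
		*probe_name = NULL;
		return -1;
	}

	memcpy(*tp_name, spec, tp_len);
	(*tp_name)[tp_len] = '\0';
	strcpy(*probe_name, sep + 1);
	return 0;
}
```

For the selftest's section name, the suffix "bpf_testmod_test_write_bare_tp:raw_tp_override_probe" would yield the tracepoint name "bpf_testmod_test_write_bare_tp" and the probe name "raw_tp_override_probe".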
Signed-off-by: Fuyu Zhao <zhaofuyu@vivo.com>
---
 tools/include/uapi/linux/bpf.h |  2 ++
 tools/lib/bpf/bpf.c            |  1 +
 tools/lib/bpf/bpf.h            |  3 ++-
 tools/lib/bpf/libbpf.c         | 27 ++++++++++++++++++++++++++-
 tools/lib/bpf/libbpf.h         |  3 ++-
 5 files changed, 33 insertions(+), 3 deletions(-)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 233de8677382..7438836b3e4b 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1071,6 +1071,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_SK_LOOKUP, BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */ BPF_PROG_TYPE_NETFILTER, + BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE, __MAX_BPF_PROG_TYPE };
@@ -1707,6 +1708,7 @@ union bpf_attr {
 		__u32 prog_fd;
 		__u32 :32;
 		__aligned_u64 cookie;
+		__aligned_u64 probe_name;
 	} raw_tracepoint;
struct { /* anonymous struct for BPF_BTF_LOAD */ diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index ab40dbf9f020..95b73f94ce72 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -1235,6 +1235,7 @@ int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts) attr.raw_tracepoint.prog_fd = prog_fd; attr.raw_tracepoint.name = ptr_to_u64(OPTS_GET(opts, tp_name, NULL)); attr.raw_tracepoint.cookie = OPTS_GET(opts, cookie, 0); + attr.raw_tracepoint.probe_name = ptr_to_u64(OPTS_GET(opts, probe_name, NULL));
fd = sys_bpf_fd(BPF_RAW_TRACEPOINT_OPEN, &attr, attr_sz); return libbpf_err_errno(fd); diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 7252150e7ad3..0ebedbd99fe5 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -630,9 +630,10 @@ struct bpf_raw_tp_opts { size_t sz; /* size of this struct for forward/backward compatibility */ const char *tp_name; __u64 cookie; + const char *probe_name; size_t :0; }; -#define bpf_raw_tp_opts__last_field cookie +#define bpf_raw_tp_opts__last_field probe_name
LIBBPF_API int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts); LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd); diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index fe4fc5438678..ce67c917ba59 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -9557,6 +9557,8 @@ static const struct bpf_sec_def section_defs[] = { SEC_DEF("raw_tp+", RAW_TRACEPOINT, 0, SEC_NONE, attach_raw_tp), SEC_DEF("raw_tracepoint.w+", RAW_TRACEPOINT_WRITABLE, 0, SEC_NONE, attach_raw_tp), SEC_DEF("raw_tp.w+", RAW_TRACEPOINT_WRITABLE, 0, SEC_NONE, attach_raw_tp), + SEC_DEF("raw_tracepoint.o+", RAW_TRACEPOINT_OVERRIDE, 0, SEC_NONE, attach_raw_tp), + SEC_DEF("raw_tp.o+", RAW_TRACEPOINT_OVERRIDE, 0, SEC_NONE, attach_raw_tp), SEC_DEF("tp_btf+", TRACING, BPF_TRACE_RAW_TP, SEC_ATTACH_BTF, attach_trace), SEC_DEF("fentry+", TRACING, BPF_TRACE_FENTRY, SEC_ATTACH_BTF, attach_trace), SEC_DEF("fmod_ret+", TRACING, BPF_MODIFY_RETURN, SEC_ATTACH_BTF, attach_trace), @@ -12684,6 +12686,7 @@ bpf_program__attach_raw_tracepoint_opts(const struct bpf_program *prog,
raw_opts.tp_name = tp_name; raw_opts.cookie = OPTS_GET(opts, cookie, 0); + raw_opts.probe_name = OPTS_GET(opts, probe_name, NULL); pfd = bpf_raw_tracepoint_open_opts(prog_fd, &raw_opts); if (pfd < 0) { pfd = -errno; @@ -12704,14 +12707,18 @@ struct bpf_link *bpf_program__attach_raw_tracepoint(const struct bpf_program *pr
static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link) { + LIBBPF_OPTS(bpf_raw_tracepoint_opts, raw_opts); static const char *const prefixes[] = { "raw_tp", "raw_tracepoint", "raw_tp.w", "raw_tracepoint.w", + "raw_tp.o", + "raw_tracepoint.o", }; size_t i; const char *tp_name = NULL; + char *dup = NULL, *sep = NULL;
*link = NULL;
@@ -12739,7 +12746,25 @@ static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf return -EINVAL; }
-	*link = bpf_program__attach_raw_tracepoint(prog, tp_name);
+	if (prog->type == BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE) {
+		dup = strdup(tp_name);
+		if (!dup)
+			return -ENOMEM;
+
+		sep = strchr(dup, ':');
+		if (!sep) {
+			free(dup);
+			return -EINVAL;
+		}
+		*sep = '\0';
+
+		tp_name = dup;
+		raw_opts.probe_name = sep + 1;
+		*link = bpf_program__attach_raw_tracepoint_opts(prog, tp_name, &raw_opts);
+		free(dup);
+	} else {
+		*link = bpf_program__attach_raw_tracepoint(prog, tp_name);
+	}
 	return libbpf_get_error(*link);
 }
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 2e91148d9b44..f4e9cb819b75 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -820,9 +820,10 @@ bpf_program__attach_tracepoint_opts(const struct bpf_program *prog, struct bpf_raw_tracepoint_opts { size_t sz; /* size of this struct for forward/backward compatibility */ __u64 cookie; + const char *probe_name; size_t :0; }; -#define bpf_raw_tracepoint_opts__last_field cookie +#define bpf_raw_tracepoint_opts__last_field probe_name
LIBBPF_API struct bpf_link * bpf_program__attach_raw_tracepoint(const struct bpf_program *prog,
Add a selftest for the new BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type. The test verifies that a BPF program can successfully override the target tracepoint probe function.
Signed-off-by: Fuyu Zhao <zhaofuyu@vivo.com>
---
 .../bpf/prog_tests/raw_tp_override_test_run.c | 23 +++++++++++++++++++
 .../bpf/progs/test_raw_tp_override_test_run.c | 20 ++++++++++++++++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |  7 ++++++
 3 files changed, 50 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/raw_tp_override_test_run.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_raw_tp_override_test_run.c
diff --git a/tools/testing/selftests/bpf/prog_tests/raw_tp_override_test_run.c b/tools/testing/selftests/bpf/prog_tests/raw_tp_override_test_run.c
new file mode 100644
index 000000000000..02301253cd9b
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/raw_tp_override_test_run.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "bpf/libbpf_internal.h"
+#include "test_raw_tp_override_test_run.skel.h"
+
+void test_raw_tp_override_test_run(void)
+{
+	struct test_raw_tp_override_test_run *skel;
+
+	skel = test_raw_tp_override_test_run__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "test_raw_tp_override_test_run__open_and_load"))
+		return;
+
+	if (!ASSERT_OK(test_raw_tp_override_test_run__attach(skel),
+		       "test_raw_tp_override_test_run__attach"))
+		goto cleanup;
+	ASSERT_OK(trigger_module_test_write(1), "trigger_write");
+	ASSERT_EQ(skel->bss->flag, 1, "check_flag");
+
+cleanup:
+	test_raw_tp_override_test_run__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_raw_tp_override_test_run.c b/tools/testing/selftests/bpf/progs/test_raw_tp_override_test_run.c
new file mode 100644
index 000000000000..eb6d24e1c737
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_raw_tp_override_test_run.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+__u32 flag = 0;
+
+/**
+ * This program overrides the raw_tp_override_probe handler in
+ * tracepoint bpf_testmod_test_write_bare_tp.
+ */
+SEC("raw_tp.o/bpf_testmod_test_write_bare_tp:raw_tp_override_probe")
+int BPF_PROG(tp_override, struct task_struct *task, char *comm)
+{
+	flag = 1;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
index 2beb9b2fcbd8..7a49178d2343 100644
--- a/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/test_kmods/bpf_testmod.c
@@ -1628,6 +1628,11 @@ static struct bpf_testmod_multi_st_ops multi_st_ops_cfi_stubs = {
 	.test_1 = bpf_testmod_multi_st_ops__test_1,
 };
+static void raw_tp_override_probe(void *ignored, struct task_struct *task, + struct bpf_testmod_test_write_ctx *ctx) +{ +} + struct bpf_struct_ops testmod_multi_st_ops = { .verifier_ops = &bpf_testmod_verifier_ops, .init = multi_st_ops_init, @@ -1665,6 +1670,7 @@ static int bpf_testmod_init(void) ret = ret ?: register_btf_id_dtor_kfuncs(bpf_testmod_dtors, ARRAY_SIZE(bpf_testmod_dtors), THIS_MODULE); + ret = ret ?: register_trace_bpf_testmod_test_write_bare_tp(raw_tp_override_probe, NULL); if (ret < 0) return ret; if (bpf_fentry_test1(0) < 0) @@ -1701,6 +1707,7 @@ static void bpf_testmod_exit(void) bpf_kfunc_close_sock(); sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file); unregister_bpf_testmod_uprobe(); + unregister_trace_bpf_testmod_test_write_bare_tp(raw_tp_override_probe, NULL); }
module_init(bpf_testmod_init);
On Wed, 17 Sep 2025 15:22:39 +0800 Fuyu Zhao <zhaofuyu@vivo.com> wrote:
> Hi everyone,
> This patchset introduces a new BPF program type that allows overriding a tracepoint probe function registered via register_trace_*.
> Motivation
> Tracepoint probe functions registered via register_trace_* in the kernel cannot be dynamically modified, changing a probe function requires recompiling the kernel and rebooting. Nor can BPF programs change an existing probe function.
I'm confused by what you mean by "tracepoint probe function"?
You mean the function callback that gets called via the "register_trace_*()"?
> Overiding tracepoint supports a way to apply patches into kernel quickly (such as applying security ones), through predefined static tracepoints, without waiting for upstream integration.
This sounds way out of scope for tracepoints. Please provide a solid example for this.
> This patchset demonstrates the way to override probe functions by BPF program.
> Overview
> This patchset adds BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type. When this type of BPF program attaches, it overrides the target tracepoint probe function.
> And it also extends a new struct type "tracepoint_func_snapshot", which extends the tracepoint structure. It is used to record the original probe function registered by kernel after BPF program being attached and restore from it after detachment.
The tracepoint structure exists for every tracepoint in the kernel. By adding a pointer to it, you just increased the size of the tracepoint. I'm already complaining that each tracepoint causes around 5K of memory overhead, and I'd like to make it smaller.
-- Steve
On Wed, Sep 17, 2025 at 12:23 AM Fuyu Zhao <zhaofuyu@vivo.com> wrote:
> Hi everyone,
> This patchset introduces a new BPF program type that allows overriding a tracepoint probe function registered via register_trace_*.
> Motivation
> Tracepoint probe functions registered via register_trace_* in the kernel cannot be dynamically modified, changing a probe function requires recompiling the kernel and rebooting. Nor can BPF programs change an existing probe function.
> Overiding tracepoint supports a way to apply patches into kernel quickly (such as applying security ones), through predefined static tracepoints, without waiting for upstream integration.
IIUC, this work solves the same problem as raw tracepoint (raw_tp) or raw tracepoint with btf (tp_btf).
Did I miss something?
Thanks,
Song