This series attempts to provide a simple way for BPF programs (and in future other consumers) to utilize BPF Type Format (BTF) information to display kernel data structures in-kernel. The use case this functionality is applied to here is to support a snprintf()-like helper to copy a BTF representation of kernel data to a string, and a BPF seq file helper to display BTF data for an iterator.
There is already support in kernel/bpf/btf.c for "show" functionality; the changes here generalize that support from seq-file specific verifier display to the more generic case and add another specific use case; rather than seq_printf()ing the show data, it is copied to a supplied string using a snprintf()-like function. Other future consumers of the show functionality could include a bpf_printk_btf() function which printk()ed the data instead. Oops messaging in particular would be an interesting application for such functionality.
The above potential use case hints at a potential reply to a reasonable objection that such typed display should be solved by tracing programs, where the in-kernel tracing records data and the userspace program prints it out. While this is certainly the recommended approach for most cases, I believe having an in-kernel mechanism would be valuable also. Critically in BPF programs it greatly simplifies debugging and tracing of such data to invoking a simple helper.
One challenge raised in an earlier iteration of this work - where the BTF printing was implemented as a printk() format specifier - was that the amount of data printed per printk() was large, and other format specifiers were far simpler. Here we sidestep that concern by printing components of the BTF representation as we go for the seq file case, and in the string case the snprintf()-like operation is intended to be a basis for perf event or ringbuf output. The reasons for avoiding bpf_trace_printk are that
1. bpf_trace_printk() strings are restricted in size and cannot display anything beyond trivial data structures; and 2. bpf_trace_printk() is for debugging purposes only.
As Alexei suggested, a bpf_trace_puts() helper could solve this in the future but it still would be limited by the 1000 byte limit for traced strings.
Default output for an sk_buff looks like this (zeroed fields are omitted):
(struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x000000007524fd8b, .data = (unsigned char *)0x000000007524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, }
Flags can modify aspects of output format; see patch 3 for more details.
Changes since v4:
- Changed approach from a BPF trace event-centric design to one utilizing a snprintf()-like helper and an iter helper (Alexei, patches 3,5) - Added tests to verify BTF output (patch 4) - Added support to tests for verifying BTF type_id-based display as well as type name via __builtin_btf_type_id (Andrii, patch 4). - Augmented task iter tests to cover the BTF-based seq helper. Because a task_struct's BTF-based representation would overflow the PAGE_SIZE limit on iterator data, the "struct fs_struct" (task->fs) is displayed for each task instead (Alexei, patch 6).
Changes since v3:
- Moved to RFC since the approach is different (and bpf-next is closed) - Rather than using a printk() format specifier as the means of invoking BTF-enabled display, a dedicated BPF helper is used. This solves the issue of printk() having to output large amounts of data using a complex mechanism such as BTF traversal, but still provides a way for the display of such data to be achieved via BPF programs. Future work could include a bpf_printk_btf() function to invoke display via printk() where the elements of a data structure are printk()ed one at a time. Thanks to Petr Mladek, Andy Shevchenko and Rasmus Villemoes who took time to look at the earlier printk() format-specifier-focused version of this and provided feedback clarifying the problems with that approach. - Added trace id to the bpf_trace_printk events as a means of separating output from standard bpf_trace_printk() events, ensuring it can be easily parsed by the reader. - Added bpf_trace_btf() helper tests which do simple verification of the various display options.
Changes since v2:
- Alexei and Yonghong suggested it would be good to use probe_kernel_read() on to-be-shown data to ensure safety during operation. Safe copy via probe_kernel_read() to a buffer object in "struct btf_show" is used to support this. A few different approaches were explored including dynamic allocation and per-cpu buffers. The downside of dynamic allocation is that it would be done during BPF program execution for bpf_trace_printk()s using %pT format specifiers. The problem with per-cpu buffers is we'd have to manage preemption and since the display of an object occurs over an extended period and in printk context where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. The approach of utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2). - safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option. - pointers are prefixed with 0x for clarity (Alexei, patch 2) - added additional comments and explanations around BTF show code, especially around determining whether objects such zeroed. Also tried to comment safe object scheme used. (Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8)
Changes since v1:
- changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifer so that we no longer rely on format %pT<type> but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. - removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. - added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei).
Alan Maguire (6): bpf: provide function to get vmlinux BTF information bpf: move to generic BTF show support, apply it to seq files/strings bpf: add bpf_btf_snprintf helper selftests/bpf: add bpf_btf_snprintf helper tests bpf: add bpf_seq_btf_write helper selftests/bpf: add test for bpf_seq_btf_write helper
include/linux/bpf.h | 3 + include/linux/btf.h | 40 + include/uapi/linux/bpf.h | 78 ++ kernel/bpf/btf.c | 978 ++++++++++++++++++--- kernel/bpf/helpers.c | 4 + kernel/bpf/verifier.c | 18 +- kernel/trace/bpf_trace.c | 133 +++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 78 ++ tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++ .../selftests/bpf/prog_tests/btf_snprintf.c | 55 ++ .../selftests/bpf/progs/bpf_iter_task_btf.c | 49 ++ .../selftests/bpf/progs/netif_receive_skb.c | 260 ++++++ 13 files changed, 1656 insertions(+), 108 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_snprintf.c create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c
It will be used later for BPF structure display support
Signed-off-by: Alan Maguire alan.maguire@oracle.com --- include/linux/bpf.h | 2 ++ kernel/bpf/verifier.c | 18 ++++++++++++------ 2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c6d9f2c..c0ad5d8 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1330,6 +1330,8 @@ int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, union bpf_attr __user *uattr); void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
+struct btf *bpf_get_btf_vmlinux(void); + /* Map specifics */ struct xdp_buff; struct sk_buff; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 814bc6c..11d7985 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -11311,6 +11311,17 @@ static int check_attach_btf_id(struct bpf_verifier_env *env) } }
+struct btf *bpf_get_btf_vmlinux(void) +{ + if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { + mutex_lock(&bpf_verifier_lock); + if (!btf_vmlinux) + btf_vmlinux = btf_parse_vmlinux(); + mutex_unlock(&bpf_verifier_lock); + } + return btf_vmlinux; +} + int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, union bpf_attr __user *uattr) { @@ -11344,12 +11355,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, env->ops = bpf_verifier_ops[env->prog->type]; is_priv = bpf_capable();
- if (!btf_vmlinux && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) { - mutex_lock(&bpf_verifier_lock); - if (!btf_vmlinux) - btf_vmlinux = btf_parse_vmlinux(); - mutex_unlock(&bpf_verifier_lock); - } + bpf_get_btf_vmlinux();
/* grab the mutex to protect few globals used by verifier */ if (!is_priv)
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string.
Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf).
By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted.
Show however supports flags which modify its behaviour:
BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown).
Signed-off-by: Alan Maguire alan.maguire@oracle.com --- include/linux/btf.h | 36 ++ kernel/bpf/btf.c | 971 ++++++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 904 insertions(+), 103 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h index a9af5e7..d0f5d3c 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -13,6 +13,7 @@ struct btf_member; struct btf_type; union bpf_attr; +struct btf_show;
extern const struct file_operations btf_fops;
@@ -46,8 +47,43 @@ int btf_get_info_by_fd(const struct btf *btf, const struct btf_type *btf_type_id_size(const struct btf *btf, u32 *type_id, u32 *ret_size); + +/* + * Options to control show behaviour. + * - BTF_SHOW_COMPACT: no formatting around type information + * - BTF_SHOW_NONAME: no struct/union member names/types + * - BTF_SHOW_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. + * - BTF_SHOW_ZERO: show zero-valued struct/union members; they + * are not displayed by default + * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read + * data before displaying it. + */ +#define BTF_SHOW_COMPACT (1ULL << 0) +#define BTF_SHOW_NONAME (1ULL << 1) +#define BTF_SHOW_PTR_RAW (1ULL << 2) +#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_UNSAFE (1ULL << 4) + void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m); + +/* + * Copy len bytes of string representation of obj of BTF type_id into buf. + * + * @btf: struct btf object + * @type_id: type id of type obj points to + * @obj: pointer to typed data + * @buf: buffer to write to + * @len: maximum length to write to buf + * @flags: show options (see above) + * + * Return: length that would have been/was copied as per snprintf, or + * negative error. + */ +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags); + int btf_get_fd_by_id(u32 id); u32 btf_id(const struct btf *btf); bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index f9ac693..70f5b88 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -284,6 +284,88 @@ static const char *btf_type_str(const struct btf_type *t) return btf_kind_str[BTF_INFO_KIND(t->info)]; }
+/* Chunk size we use in safe copy of data to be shown. */ +#define BTF_SHOW_OBJ_SAFE_SIZE 256 + +/* + * This is the maximum size of a base type value (equivalent to a + * 128-bit int); if we are at the end of our safe buffer and have + * less than 16 bytes space we can't be assured of being able + * to copy the next type safely, so in such cases we will initiate + * a new copy. + */ +#define BTF_SHOW_OBJ_BASE_TYPE_SIZE 16 + +/* + * Common data to all BTF show operations. Private show functions can add + * their own data to a structure containing a struct btf_show and consult it + * in the show callback. See btf_type_show() below. + * + * One challenge with showing nested data is we want to skip 0-valued + * data, but in order to figure out whether a nested object is all zeros + * we need to walk through it. As a result, we need to make two passes + * when handling structs, unions and arrays; the first path simply looks + * for nonzero data, while the second actually does the display. The first + * pass is signalled by show->state.depth_check being set, and if we + * encounter a non-zero value we set show->state.depth_to_show to + * the depth at which we encountered it. When we have completed the + * first pass, we will know if anything needs to be displayed if + * depth_to_show > depth. See btf_[struct,array]_show() for the + * implementation of this. + * + * Another problem is we want to ensure the data for display is safe to + * access. To support this, the "struct obj" is used to track the data + * object and our safe copy of it. We copy portions of the data needed + * to the object "copy" buffer, but because its size is limited to + * BTF_SHOW_OBJ_COPY_LEN bytes, multiple copies may be required as we + * traverse larger objects for display. + * + * The various data type show functions all start with a call to + * btf_show_start_type() which returns a pointer to the safe copy + * of the data needed (or if BTF_SHOW_UNSAFE is specified, to the + * raw data itself). btf_show_obj_safe() is responsible for + * using copy_from_kernel_nofault() to update the safe data if necessary + * as we traverse the object's data. skbuff-like semantics are + * used: + * + * - obj.head points to the start of the toplevel object for display + * - obj.size is the size of the toplevel object + * - obj.data points to the current point in the original data at + * which our safe data starts. obj.data will advance as we copy + * portions of the data. + * + * In most cases a single copy will suffice, but larger data structures + * such as "struct task_struct" will require many copies. The logic in + * btf_show_obj_safe() handles the logic that determines if a new + * copy_from_kernel_nofault() is needed. + */ +struct btf_show { + u64 flags; + void *target; /* target of show operation (seq file, buffer) */ + void (*showfn)(struct btf_show *show, const char *fmt, ...); + const struct btf *btf; + /* below are used during iteration */ + struct { + u8 depth; + u8 depth_to_show; + u8 depth_check; + u8 array_member:1, + array_terminated:1; + u16 array_encoding; + u32 type_id; + int status; /* non-zero for error */ + const struct btf_type *type; + const struct btf_member *member; + char name[KSYM_NAME_LEN]; /* space for member name/type */ + } state; + struct { + u32 size; + void *head; + void *data; + u8 safe[BTF_SHOW_OBJ_SAFE_SIZE]; + } obj; +}; + struct btf_kind_operations { s32 (*check_meta)(struct btf_verifier_env *env, const struct btf_type *t, @@ -300,9 +382,9 @@ struct btf_kind_operations { const struct btf_type *member_type); void (*log_details)(struct btf_verifier_env *env, const struct btf_type *t); - void (*seq_show)(const struct btf *btf, const struct btf_type *t, + void (*show)(const struct btf *btf, const struct btf_type *t, u32 type_id, void *data, u8 bits_offsets, - struct seq_file *m); + struct btf_show *show); };
static const struct btf_kind_operations * const kind_ops[NR_BTF_KINDS]; @@ -679,6 +761,457 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s, return true; }
+/* Similar to btf_type_skip_modifiers() but does not skip typedefs. */ +static inline +const struct btf_type *btf_type_skip_qualifiers(const struct btf *btf, u32 id) +{ + const struct btf_type *t = btf_type_by_id(btf, id); + + while (btf_type_is_modifier(t) && + BTF_INFO_KIND(t->info) != BTF_KIND_TYPEDEF) { + id = t->type; + t = btf_type_by_id(btf, t->type); + } + + return t; +} + +#define BTF_SHOW_MAX_ITER 10 + +#define BTF_KIND_BIT(kind) (1ULL << kind) + +/* + * Populate show->state.name with type name information. + * Format of type name is + * + * [.member_name = ] (type_name) + */ +static inline const char *btf_show_name(struct btf_show *show) +{ + /* BTF_MAX_ITER array suffixes "[]" */ + const char *array_suffixes = "[][][][][][][][][][]"; + const char *array_suffix = &array_suffixes[strlen(array_suffixes)]; + /* BTF_MAX_ITER pointer suffixes "*" */ + const char *ptr_suffixes = "**********"; + const char *ptr_suffix = &ptr_suffixes[strlen(ptr_suffixes)]; + const char *name = NULL, *prefix = "", *parens = ""; + const struct btf_member *m = show->state.member; + const struct btf_type *t = show->state.type; + const struct btf_array *array; + u32 id = show->state.type_id; + const char *member = NULL; + bool show_member = false; + u64 kinds = 0; + int i; + + show->state.name[0] = '\0'; + + /* + * Don't show type name if we're showing an array member; + * in that case we show the array type so don't need to repeat + * ourselves for each member. + */ + if (show->state.array_member) + return ""; + + /* Retrieve member name, if any. */ + if (m) { + member = btf_name_by_offset(show->btf, m->name_off); + show_member = strlen(member) > 0; + id = m->type; + } + + /* + * Start with type_id, as we have resolved the struct btf_type * + * via btf_modifier_show() past the parent typedef to the child + * struct, int etc it is defined as. In such cases, the type_id + * still represents the starting type while the struct btf_type * + * in our show->state points at the resolved type of the typedef. + */ + t = btf_type_by_id(show->btf, id); + if (!t) + return ""; + + /* + * The goal here is to build up the right number of pointer and + * array suffixes while ensuring the type name for a typedef + * is represented. Along the way we accumulate a list of + * BTF kinds we have encountered, since these will inform later + * display; for example, pointer types will not require an + * opening "{" for struct, we will just display the pointer value. + * + * We also want to accumulate the right number of pointer or array + * indices in the format string while iterating until we get to + * the typedef/pointee/array member target type. + * + * We start by pointing at the end of pointer and array suffix + * strings; as we accumulate pointers and arrays we move the pointer + * or array string backwards so it will show the expected number of + * '*' or '[]' for the type. BTF_SHOW_MAX_ITER of nesting of pointers + * and/or arrays and typedefs are supported as a precaution. + * + * We also want to get typedef name while proceeding to resolve + * type it points to so that we can add parentheses if it is a + * "typedef struct" etc. + */ + for (i = 0; i < BTF_SHOW_MAX_ITER; i++) { + + switch (BTF_INFO_KIND(t->info)) { + case BTF_KIND_TYPEDEF: + if (!name) + name = btf_name_by_offset(show->btf, + t->name_off); + kinds |= BTF_KIND_BIT(BTF_KIND_TYPEDEF); + id = t->type; + break; + case BTF_KIND_ARRAY: + kinds |= BTF_KIND_BIT(BTF_KIND_ARRAY); + parens = "["; + if (!t) + return ""; + array = btf_type_array(t); + if (array_suffix > array_suffixes) + array_suffix -= 2; + id = array->type; + break; + case BTF_KIND_PTR: + kinds |= BTF_KIND_BIT(BTF_KIND_PTR); + if (ptr_suffix > ptr_suffixes) + ptr_suffix -= 1; + id = t->type; + break; + default: + id = 0; + break; + } + if (!id) + break; + t = btf_type_skip_qualifiers(show->btf, id); + } + /* We may not be able to represent this type; bail to be safe */ + if (i == BTF_SHOW_MAX_ITER) + return ""; + + if (!name) + name = btf_name_by_offset(show->btf, t->name_off); + + switch (BTF_INFO_KIND(t->info)) { + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + prefix = BTF_INFO_KIND(t->info) == BTF_KIND_STRUCT ? + "struct" : "union"; + /* if it's an array of struct/union, parens is already set */ + if (!(kinds & (BTF_KIND_BIT(BTF_KIND_ARRAY)))) + parens = "{"; + break; + case BTF_KIND_ENUM: + prefix = "enum"; + break; + default: + break; + } + + /* pointer does not require parens */ + if (kinds & BTF_KIND_BIT(BTF_KIND_PTR)) + parens = ""; + /* typedef does not require struct/union/enum prefix */ + if (kinds & BTF_KIND_BIT(BTF_KIND_TYPEDEF)) + prefix = ""; + + if (!name) + name = ""; + + /* Even if we don't want type name info, we want parentheses etc */ + if (show->flags & BTF_SHOW_NONAME) + snprintf(show->state.name, sizeof(show->state.name), "%s", + parens); + else + snprintf(show->state.name, sizeof(show->state.name), + "%s%s%s(%s%s%s%s%s%s)%s", + /* first 3 strings comprise ".member = " */ + show_member ? "." : "", + show_member ? member : "", + show_member ? " = " : "", + /* ...next is our prefix (struct, enum, etc) */ + prefix, + strlen(prefix) > 0 && strlen(name) > 0 ? " " : "", + /* ...this is the type name itself */ + name, + /* ...suffixed by the appropriate '*', '[]' suffixes */ + strlen(ptr_suffix) > 0 ? " " : "", ptr_suffix, + array_suffix, parens); + + return show->state.name; +} + +#define btf_show(show, ...) \ + do { \ + if (!show->state.depth_check) \ + show->showfn(show, __VA_ARGS__); \ + } while (0) + +static inline const char *__btf_show_indent(struct btf_show *show) +{ + const char *indents = " "; + const char *indent = &indents[strlen(indents)]; + + if ((indent - show->state.depth) >= indents) + return indent - show->state.depth; + return indents; +} + +#define btf_show_indent(show) \ + ((show->flags & BTF_SHOW_COMPACT) ? "" : __btf_show_indent(show)) + +#define btf_show_newline(show) \ + ((show->flags & BTF_SHOW_COMPACT) ? "" : "\n") + +#define btf_show_delim(show) \ + (show->state.depth == 0 ? "" : \ + ((show->flags & BTF_SHOW_COMPACT) && show->state.type && \ + BTF_INFO_KIND(show->state.type->info) == BTF_KIND_UNION) ? "|" : ",") + +#define btf_show_type_value(show, fmt, value) \ + do { \ + if ((value) != 0 || (show->flags & BTF_SHOW_ZERO) || \ + show->state.depth == 0) { \ + btf_show(show, "%s%s" fmt "%s%s", \ + btf_show_indent(show), \ + btf_show_name(show), \ + value, btf_show_delim(show), \ + btf_show_newline(show)); \ + if (show->state.depth > show->state.depth_to_show) \ + show->state.depth_to_show = show->state.depth; \ + } \ + } while (0) + +#define btf_show_type_values(show, fmt, ...) \ + do { \ + btf_show(show, "%s%s" fmt "%s%s", btf_show_indent(show), \ + btf_show_name(show), \ + __VA_ARGS__, btf_show_delim(show), \ + btf_show_newline(show)); \ + if (show->state.depth > show->state.depth_to_show) \ + show->state.depth_to_show = show->state.depth; \ + } while (0) + +/* How much is left to copy to safe buffer after @data? */ +#define btf_show_obj_size_left(show, data) \ + (show->obj.head + show->obj.size - data) + +/* Is object pointed to by @data of @size already copied to our safe buffer? */ +#define btf_show_obj_is_safe(show, data, size) \ + (data >= show->obj.data && \ + (data + size) < (show->obj.data + BTF_SHOW_OBJ_SAFE_SIZE)) + +/* + * If object pointed to by @data of @size falls within our safe buffer, return + * the equivalent pointer to the same safe data. Assumes + * copy_from_kernel_nofault() has already happened and our safe buffer is + * populated. + */ +#define __btf_show_obj_safe(show, data, size) \ + (btf_show_obj_is_safe(show, data, size) ? \ + show->obj.safe + (data - show->obj.data) : NULL) + +/* + * Return a safe-to-access version of data pointed to by @data. + * We do this by copying the relevant amount of information + * to the struct btf_show obj.safe buffer using copy_from_kernel_nofault(). + * + * If BTF_SHOW_UNSAFE is specified, just return data as-is; no + * safe copy is needed. + * + * Otherwise we need to determine if we have the required amount + * of data (determined by the @data pointer and the size of the + * largest base type we can encounter (represented by + * BTF_SHOW_OBJ_BASE_TYPE_SIZE). Having that much data ensures + * that we will be able to print some of the current object, + * and if more is needed a copy will be triggered. + * Some objects such as structs will not fit into the buffer; + * in such cases additional copies when we iterate over their + * members may be needed. + * + * btf_show_obj_safe() is used to return a safe buffer for + * btf_show_start_type(); this ensures that as we recurse into + * nested types we always have safe data for the given type. + * This approach is somewhat wasteful; it's possible for example + * that when iterating over a large union we'll end up copying the + * same data repeatedly, but the goal is safety not performance. + * We use stack data as opposed to per-CPU buffers because the + * iteration over a type can take some time, and preemption handling + * would greatly complicate use of the safe buffer. + */ +static inline void *btf_show_obj_safe(struct btf_show *show, + const struct btf_type *t, + void *data) +{ + int size_left, size; + void *safe = NULL; + + if (show->flags & BTF_SHOW_UNSAFE) + return data; + + (void) btf_resolve_size(show->btf, t, &size); + + /* + * Is this toplevel object? If so, set total object size and + * initialize pointers. Otherwise check if we still fall within + * our safe object data. + */ + if (show->state.depth == 0) { + show->obj.size = size; + show->obj.head = data; + } else { + /* + * If the size of the current object is > our remaining + * safe buffer we _may_ need to do a new copy. However + * consider the case of a nested struct; it's size pushes + * us over the safe buffer limit, but showing any individual + * struct members does not. In such cases, we don't need + * to initiate a fresh copy yet; however we definitely need + * at least BTF_SHOW_OBJ_BASE_TYPE_SIZE bytes left + * in our buffer, regardless of the current object size. + * The logic here is that as we resolve types we will + * hit a base type at some point, and we need to be sure + * the next chunk of data is safely available to display + * that type info safely. We cannot rely on the size of + * the current object here because it may be much larger + * than our current buffer (e.g. task_struct is 8k). + * All we want to do here is ensure that we can print the + * next basic type, which we can if either + * - the current type size is within the safe buffer; or + * - at least BTF_SHOW_OBJ_BASE_TYPE_SIZE bytes are left in + * the safe buffer. + */ + safe = __btf_show_obj_safe(show, data, + min(size, + BTF_SHOW_OBJ_BASE_TYPE_SIZE)); + } + + /* + * We need a new copy to our safe object, either because we haven't + * yet copied and are intializing safe data, or because the data + * we want falls outside the boundaries of the safe object. + */ + if (!safe) { + size_left = btf_show_obj_size_left(show, data); + if (size_left > BTF_SHOW_OBJ_SAFE_SIZE) + size_left = BTF_SHOW_OBJ_SAFE_SIZE; + show->state.status = copy_from_kernel_nofault(show->obj.safe, + data, size_left); + if (!show->state.status) { + show->obj.data = data; + safe = show->obj.safe; + } + } + + return safe; +} + +/* + * Set the type we are starting to show and return a safe data pointer + * to be used for showing the associated data. + */ +static inline void *btf_show_start_type(struct btf_show *show, + const struct btf_type *t, + u32 type_id, + void *data) +{ + show->state.type = t; + show->state.type_id = type_id; + show->state.name[0] = '\0'; + + return btf_show_obj_safe(show, t, data); +} + +static inline void btf_show_end_type(struct btf_show *show) +{ + show->state.type = NULL; + show->state.type_id = 0; + show->state.name[0] = '\0'; +} + +static inline void *btf_show_start_aggr_type(struct btf_show *show, + const struct btf_type *t, + u32 type_id, + void *data) +{ + void *safe_data = btf_show_start_type(show, t, type_id, data); + + if (!safe_data) + return safe_data; + + btf_show(show, "%s%s%s", btf_show_indent(show), + btf_show_name(show), + btf_show_newline(show)); + show->state.depth++; + return safe_data; +} + +static inline void btf_show_end_aggr_type(struct btf_show *show, + const char *suffix) +{ + show->state.depth--; + btf_show(show, "%s%s%s%s", btf_show_indent(show), suffix, + btf_show_delim(show), btf_show_newline(show)); + btf_show_end_type(show); +} + +static inline void btf_show_start_member(struct btf_show *show, + const struct btf_member *m) +{ + show->state.member = m; +} + +static inline void btf_show_start_array_member(struct btf_show *show) +{ + show->state.array_member = 1; + btf_show_start_member(show, NULL); +} + +static inline void btf_show_end_member(struct btf_show *show) +{ + show->state.member = NULL; +} + +static inline void btf_show_end_array_member(struct btf_show *show) +{ + show->state.array_member = 0; + btf_show_end_member(show); +} + +static inline void *btf_show_start_array_type(struct btf_show *show, + const struct btf_type *t, + u32 type_id, + u16 array_encoding, + void *data) +{ + show->state.array_encoding = array_encoding; + show->state.array_terminated = 0; + return btf_show_start_aggr_type(show, t, type_id, data); +} + +static inline void btf_show_end_array_type(struct btf_show *show) +{ + show->state.array_encoding = 0; + show->state.array_terminated = 0; + btf_show_end_aggr_type(show, "]"); +} + +static inline void *btf_show_start_struct_type(struct btf_show *show, + const struct btf_type *t, + u32 type_id, + void *data) +{ + return btf_show_start_aggr_type(show, t, type_id, data); +} + +static inline void btf_show_end_struct_type(struct btf_show *show) +{ + btf_show_end_aggr_type(show, "}"); +} + __printf(2, 3) static void __btf_verifier_log(struct bpf_verifier_log *log, const char *fmt, ...) { @@ -1268,11 +1801,11 @@ static int btf_df_resolve(struct btf_verifier_env *env, return -EINVAL; }
-static void btf_df_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offsets, - struct seq_file *m) +static void btf_df_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offsets, + struct btf_show *show) { - seq_printf(m, "<unsupported kind:%u>", BTF_INFO_KIND(t->info)); + btf_show(show, "<unsupported kind:%u>", BTF_INFO_KIND(t->info)); }
static int btf_int_check_member(struct btf_verifier_env *env, @@ -1445,7 +1978,7 @@ static void btf_int_log(struct btf_verifier_env *env, btf_int_encoding_str(BTF_INT_ENCODING(int_data))); }
-static void btf_int128_print(struct seq_file *m, void *data) +static void btf_int128_print(struct btf_show *show, void *data) { /* data points to a __int128 number. * Suppose @@ -1464,9 +1997,10 @@ static void btf_int128_print(struct seq_file *m, void *data) lower_num = *(u64 *)data; #endif if (upper_num == 0) - seq_printf(m, "0x%llx", lower_num); + btf_show_type_value(show, "0x%llx", lower_num); else - seq_printf(m, "0x%llx%016llx", upper_num, lower_num); + btf_show_type_values(show, "0x%llx%016llx", upper_num, + lower_num); }
static void btf_int128_shift(u64 *print_num, u16 left_shift_bits, @@ -1510,8 +2044,8 @@ static void btf_int128_shift(u64 *print_num, u16 left_shift_bits, #endif }
-static void btf_bitfield_seq_show(void *data, u8 bits_offset, - u8 nr_bits, struct seq_file *m) +static void btf_bitfield_show(void *data, u8 bits_offset, + u8 nr_bits, struct btf_show *show) { u16 left_shift_bits, right_shift_bits; u8 nr_copy_bytes; @@ -1531,14 +2065,14 @@ static void btf_bitfield_seq_show(void *data, u8 bits_offset, right_shift_bits = BITS_PER_U128 - nr_bits;
btf_int128_shift(print_num, left_shift_bits, right_shift_bits); - btf_int128_print(m, print_num); + btf_int128_print(show, print_num); }
-static void btf_int_bits_seq_show(const struct btf *btf, - const struct btf_type *t, - void *data, u8 bits_offset, - struct seq_file *m) +static void btf_int_bits_show(const struct btf *btf, + const struct btf_type *t, + void *data, u8 bits_offset, + struct btf_show *show) { u32 int_data = btf_type_int(t); u8 nr_bits = BTF_INT_BITS(int_data); @@ -1551,55 +2085,77 @@ static void btf_int_bits_seq_show(const struct btf *btf, total_bits_offset = bits_offset + BTF_INT_OFFSET(int_data); data += BITS_ROUNDDOWN_BYTES(total_bits_offset); bits_offset = BITS_PER_BYTE_MASKED(total_bits_offset); - btf_bitfield_seq_show(data, bits_offset, nr_bits, m); + btf_bitfield_show(data, bits_offset, nr_bits, show); }
-static void btf_int_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offset, - struct seq_file *m) +static void btf_int_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) { u32 int_data = btf_type_int(t); u8 encoding = BTF_INT_ENCODING(int_data); bool sign = encoding & BTF_INT_SIGNED; u8 nr_bits = BTF_INT_BITS(int_data); + void *safe_data; + + safe_data = btf_show_start_type(show, t, type_id, data); + if (!safe_data) + return;
if (bits_offset || BTF_INT_OFFSET(int_data) || BITS_PER_BYTE_MASKED(nr_bits)) { - btf_int_bits_seq_show(btf, t, data, bits_offset, m); - return; + btf_int_bits_show(btf, t, safe_data, bits_offset, show); + goto out; }
switch (nr_bits) { case 128: - btf_int128_print(m, data); + btf_int128_print(show, safe_data); break; case 64: if (sign) - seq_printf(m, "%lld", *(s64 *)data); + btf_show_type_value(show, "%lld", *(s64 *)safe_data); else - seq_printf(m, "%llu", *(u64 *)data); + btf_show_type_value(show, "%llu", *(u64 *)safe_data); break; case 32: if (sign) - seq_printf(m, "%d", *(s32 *)data); + btf_show_type_value(show, "%d", *(s32 *)safe_data); else - seq_printf(m, "%u", *(u32 *)data); + btf_show_type_value(show, "%u", *(u32 *)safe_data); break; case 16: if (sign) - seq_printf(m, "%d", *(s16 *)data); + btf_show_type_value(show, "%d", *(s16 *)safe_data); else - seq_printf(m, "%u", *(u16 *)data); + btf_show_type_value(show, "%u", *(u16 *)safe_data); break; case 8: + if (show->state.array_encoding == BTF_INT_CHAR) { + /* check for null terminator */ + if (show->state.array_terminated) + break; + if (*(char *)data == '\0') { + show->state.array_terminated = 1; + break; + } + if (isprint(*(char *)data)) { + btf_show_type_value(show, "'%c'", + *(char *)safe_data); + break; + } + } if (sign) - seq_printf(m, "%d", *(s8 *)data); + btf_show_type_value(show, "%d", *(s8 *)safe_data); else - seq_printf(m, "%u", *(u8 *)data); + btf_show_type_value(show, "%u", *(u8 *)safe_data); break; default: - btf_int_bits_seq_show(btf, t, data, bits_offset, m); + btf_int_bits_show(btf, t, safe_data, bits_offset, show); + break; } +out: + btf_show_end_type(show); }
static const struct btf_kind_operations int_ops = { @@ -1608,7 +2164,7 @@ static void btf_int_seq_show(const struct btf *btf, const struct btf_type *t, .check_member = btf_int_check_member, .check_kflag_member = btf_int_check_kflag_member, .log_details = btf_int_log, - .seq_show = btf_int_seq_show, + .show = btf_int_show, };
static int btf_modifier_check_member(struct btf_verifier_env *env, @@ -1872,34 +2428,44 @@ static int btf_ptr_resolve(struct btf_verifier_env *env, return 0; }
-static void btf_modifier_seq_show(const struct btf *btf, - const struct btf_type *t, - u32 type_id, void *data, - u8 bits_offset, struct seq_file *m) +static void btf_modifier_show(const struct btf *btf, + const struct btf_type *t, + u32 type_id, void *data, + u8 bits_offset, struct btf_show *show) { if (btf->resolved_ids) t = btf_type_id_resolve(btf, &type_id); else t = btf_type_skip_modifiers(btf, type_id, NULL);
- btf_type_ops(t)->seq_show(btf, t, type_id, data, bits_offset, m); + btf_type_ops(t)->show(btf, t, type_id, data, bits_offset, show); }
-static void btf_var_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offset, - struct seq_file *m) +static void btf_var_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) { t = btf_type_id_resolve(btf, &type_id);
- btf_type_ops(t)->seq_show(btf, t, type_id, data, bits_offset, m); + btf_type_ops(t)->show(btf, t, type_id, data, bits_offset, show); }
-static void btf_ptr_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offset, - struct seq_file *m) +static void btf_ptr_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) { - /* It is a hashed value */ - seq_printf(m, "%p", *(void **)data); + void *safe_data; + + safe_data = btf_show_start_type(show, t, type_id, data); + if (!safe_data) + return; + + /* It is a hashed value unless BTF_SHOW_PTR_RAW is specified */ + if (show->flags & BTF_SHOW_PTR_RAW) + btf_show_type_value(show, "0x%px", *(void **)safe_data); + else + btf_show_type_value(show, "0x%p", *(void **)safe_data); + btf_show_end_type(show); }
static void btf_ref_type_log(struct btf_verifier_env *env, @@ -1914,7 +2480,7 @@ static void btf_ref_type_log(struct btf_verifier_env *env, .check_member = btf_modifier_check_member, .check_kflag_member = btf_modifier_check_kflag_member, .log_details = btf_ref_type_log, - .seq_show = btf_modifier_seq_show, + .show = btf_modifier_show, };
static struct btf_kind_operations ptr_ops = { @@ -1923,7 +2489,7 @@ static void btf_ref_type_log(struct btf_verifier_env *env, .check_member = btf_ptr_check_member, .check_kflag_member = btf_generic_check_kflag_member, .log_details = btf_ref_type_log, - .seq_show = btf_ptr_seq_show, + .show = btf_ptr_show, };
static s32 btf_fwd_check_meta(struct btf_verifier_env *env, @@ -1964,7 +2530,7 @@ static void btf_fwd_type_log(struct btf_verifier_env *env, .check_member = btf_df_check_member, .check_kflag_member = btf_df_check_kflag_member, .log_details = btf_fwd_type_log, - .seq_show = btf_df_seq_show, + .show = btf_df_show, };
static int btf_array_check_member(struct btf_verifier_env *env, @@ -2123,28 +2689,90 @@ static void btf_array_log(struct btf_verifier_env *env, array->type, array->index_type, array->nelems); }
-static void btf_array_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offset, - struct seq_file *m) +static void __btf_array_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) { const struct btf_array *array = btf_type_array(t); const struct btf_kind_operations *elem_ops; const struct btf_type *elem_type; - u32 i, elem_size, elem_type_id; + u32 i, elem_size = 0, elem_type_id; + u16 encoding = 0;
elem_type_id = array->type; - elem_type = btf_type_id_size(btf, &elem_type_id, &elem_size); + elem_type = btf_type_skip_modifiers(btf, elem_type_id, NULL); + if (elem_type && btf_type_has_size(elem_type)) + elem_size = elem_type->size; + + if (elem_type && btf_type_is_int(elem_type)) { + u32 int_type = btf_type_int(elem_type); + + encoding = BTF_INT_ENCODING(int_type); + + /* + * BTF_INT_CHAR encoding never seems to be set for + * char arrays, so if size is 1 and element is + * printable as a char, we'll do that. + */ + if (elem_size == 1) + encoding = BTF_INT_CHAR; + } + + if (!btf_show_start_array_type(show, t, type_id, encoding, data)) + return; + + if (!elem_type) + goto out; elem_ops = btf_type_ops(elem_type); - seq_puts(m, "["); + for (i = 0; i < array->nelems; i++) { - if (i) - seq_puts(m, ",");
- elem_ops->seq_show(btf, elem_type, elem_type_id, data, - bits_offset, m); + btf_show_start_array_member(show); + + elem_ops->show(btf, elem_type, elem_type_id, data, + bits_offset, show); data += elem_size; + + btf_show_end_array_member(show); + + if (show->state.array_terminated) + break; } - seq_puts(m, "]"); +out: + btf_show_end_array_type(show); +} + +static void btf_array_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) +{ + const struct btf_member *m = show->state.member; + + /* + * First check if any members would be shown (are non-zero). + * See comments above "struct btf_show" definition for more + * details on how this works at a high-level. + */ + if (show->state.depth > 0 && !(show->flags & BTF_SHOW_ZERO)) { + if (!show->state.depth_check) { + show->state.depth_check = show->state.depth + 1; + show->state.depth_to_show = 0; + } + __btf_array_show(btf, t, type_id, data, bits_offset, show); + show->state.member = m; + + if (show->state.depth_check != show->state.depth + 1) + return; + show->state.depth_check = 0; + + if (show->state.depth_to_show <= show->state.depth) + return; + /* + * Reaching here indicates we have recursed and found + * non-zero array member(s). + */ + } + __btf_array_show(btf, t, type_id, data, bits_offset, show); }
static struct btf_kind_operations array_ops = { @@ -2153,7 +2781,7 @@ static void btf_array_seq_show(const struct btf *btf, const struct btf_type *t, .check_member = btf_array_check_member, .check_kflag_member = btf_generic_check_kflag_member, .log_details = btf_array_log, - .seq_show = btf_array_seq_show, + .show = btf_array_show, };
static int btf_struct_check_member(struct btf_verifier_env *env, @@ -2376,15 +3004,18 @@ int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t) return off; }
-static void btf_struct_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offset, - struct seq_file *m) +static void __btf_struct_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) { - const char *seq = BTF_INFO_KIND(t->info) == BTF_KIND_UNION ? "|" : ","; const struct btf_member *member; + void *safe_data; u32 i;
- seq_puts(m, "{"); + safe_data = btf_show_start_struct_type(show, t, type_id, data); + if (!safe_data) + return; + for_each_member(i, t, member) { const struct btf_type *member_type = btf_type_by_id(btf, member->type); @@ -2393,23 +3024,65 @@ static void btf_struct_seq_show(const struct btf *btf, const struct btf_type *t, u32 bytes_offset; u8 bits8_offset;
- if (i) - seq_puts(m, seq); + btf_show_start_member(show, member);
member_offset = btf_member_bit_offset(t, member); bitfield_size = btf_member_bitfield_size(t, member); bytes_offset = BITS_ROUNDDOWN_BYTES(member_offset); bits8_offset = BITS_PER_BYTE_MASKED(member_offset); if (bitfield_size) { - btf_bitfield_seq_show(data + bytes_offset, bits8_offset, - bitfield_size, m); + safe_data = btf_show_start_type(show, member_type, + member->type, + data + bytes_offset); + if (safe_data) + btf_bitfield_show(safe_data, + bits8_offset, + bitfield_size, show); + btf_show_end_type(show); } else { ops = btf_type_ops(member_type); - ops->seq_show(btf, member_type, member->type, - data + bytes_offset, bits8_offset, m); + ops->show(btf, member_type, member->type, + data + bytes_offset, bits8_offset, show); } + + btf_show_end_member(show); } - seq_puts(m, "}"); + + btf_show_end_struct_type(show); +} + +static void btf_struct_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) +{ + const struct btf_member *m = show->state.member; + + /* + * First check if any members would be shown (are non-zero). + * See comments above "struct btf_show" definition for more + * details on how this works at a high-level. + */ + if (show->state.depth > 0 && !(show->flags & BTF_SHOW_ZERO)) { + if (!show->state.depth_check) { + show->state.depth_check = show->state.depth + 1; + show->state.depth_to_show = 0; + } + __btf_struct_show(btf, t, type_id, data, bits_offset, show); + /* Restore saved member data here */ + show->state.member = m; + if (show->state.depth_check != show->state.depth + 1) + return; + show->state.depth_check = 0; + + if (show->state.depth_to_show <= show->state.depth) + return; + /* + * Reaching here indicates we have recursed and found + * non-zero child values. + */ + } + + __btf_struct_show(btf, t, type_id, data, bits_offset, show); }
static struct btf_kind_operations struct_ops = { @@ -2418,7 +3091,7 @@ static void btf_struct_seq_show(const struct btf *btf, const struct btf_type *t, .check_member = btf_struct_check_member, .check_kflag_member = btf_generic_check_kflag_member, .log_details = btf_struct_log, - .seq_show = btf_struct_seq_show, + .show = btf_struct_show, };
static int btf_enum_check_member(struct btf_verifier_env *env, @@ -2549,24 +3222,35 @@ static void btf_enum_log(struct btf_verifier_env *env, btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t)); }
-static void btf_enum_seq_show(const struct btf *btf, const struct btf_type *t, - u32 type_id, void *data, u8 bits_offset, - struct seq_file *m) +static void btf_enum_show(const struct btf *btf, const struct btf_type *t, + u32 type_id, void *data, u8 bits_offset, + struct btf_show *show) { const struct btf_enum *enums = btf_type_enum(t); u32 i, nr_enums = btf_type_vlen(t); - int v = *(int *)data; + void *safe_data; + int v; + + safe_data = btf_show_start_type(show, t, type_id, data); + if (!safe_data) + return; + + v = *(int *)safe_data;
for (i = 0; i < nr_enums; i++) { - if (v == enums[i].val) { - seq_printf(m, "%s", - __btf_name_by_offset(btf, - enums[i].name_off)); - return; - } + if (v != enums[i].val) + continue; + + btf_show_type_value(show, "%s", + __btf_name_by_offset(btf, + enums[i].name_off)); + + btf_show_end_type(show); + return; }
- seq_printf(m, "%d", v); + btf_show_type_value(show, "%d", v); + btf_show_end_type(show); }
static struct btf_kind_operations enum_ops = { @@ -2575,7 +3259,7 @@ static void btf_enum_seq_show(const struct btf *btf, const struct btf_type *t, .check_member = btf_enum_check_member, .check_kflag_member = btf_enum_check_kflag_member, .log_details = btf_enum_log, - .seq_show = btf_enum_seq_show, + .show = btf_enum_show, };
static s32 btf_func_proto_check_meta(struct btf_verifier_env *env, @@ -2662,7 +3346,7 @@ static void btf_func_proto_log(struct btf_verifier_env *env, .check_member = btf_df_check_member, .check_kflag_member = btf_df_check_kflag_member, .log_details = btf_func_proto_log, - .seq_show = btf_df_seq_show, + .show = btf_df_show, };
static s32 btf_func_check_meta(struct btf_verifier_env *env, @@ -2696,7 +3380,7 @@ static s32 btf_func_check_meta(struct btf_verifier_env *env, .check_member = btf_df_check_member, .check_kflag_member = btf_df_check_kflag_member, .log_details = btf_ref_type_log, - .seq_show = btf_df_seq_show, + .show = btf_df_show, };
static s32 btf_var_check_meta(struct btf_verifier_env *env, @@ -2760,7 +3444,7 @@ static void btf_var_log(struct btf_verifier_env *env, const struct btf_type *t) .check_member = btf_df_check_member, .check_kflag_member = btf_df_check_kflag_member, .log_details = btf_var_log, - .seq_show = btf_var_seq_show, + .show = btf_var_show, };
static s32 btf_datasec_check_meta(struct btf_verifier_env *env, @@ -2886,24 +3570,28 @@ static void btf_datasec_log(struct btf_verifier_env *env, btf_verifier_log(env, "size=%u vlen=%u", t->size, btf_type_vlen(t)); }
-static void btf_datasec_seq_show(const struct btf *btf, - const struct btf_type *t, u32 type_id, - void *data, u8 bits_offset, - struct seq_file *m) +static void btf_datasec_show(const struct btf *btf, + const struct btf_type *t, u32 type_id, + void *data, u8 bits_offset, + struct btf_show *show) { const struct btf_var_secinfo *vsi; const struct btf_type *var; u32 i;
- seq_printf(m, "section ("%s") = {", __btf_name_by_offset(btf, t->name_off)); + if (!btf_show_start_type(show, t, type_id, data)) + return; + + btf_show_type_value(show, "section ("%s") = {", + __btf_name_by_offset(btf, t->name_off)); for_each_vsi(i, t, vsi) { var = btf_type_by_id(btf, vsi->type); if (i) - seq_puts(m, ","); - btf_type_ops(var)->seq_show(btf, var, vsi->type, - data + vsi->offset, bits_offset, m); + btf_show(show, ","); + btf_type_ops(var)->show(btf, var, vsi->type, + data + vsi->offset, bits_offset, show); } - seq_puts(m, "}"); + btf_show_end_type(show); }
static const struct btf_kind_operations datasec_ops = { @@ -2912,7 +3600,7 @@ static void btf_datasec_seq_show(const struct btf *btf, .check_member = btf_df_check_member, .check_kflag_member = btf_df_check_kflag_member, .log_details = btf_datasec_log, - .seq_show = btf_datasec_seq_show, + .show = btf_datasec_show, };
static int btf_func_proto_check(struct btf_verifier_env *env, @@ -4619,12 +5307,89 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog, return 0; }
-void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, - struct seq_file *m) +void btf_type_show(const struct btf *btf, u32 type_id, void *obj, + struct btf_show *show) { const struct btf_type *t = btf_type_by_id(btf, type_id);
- btf_type_ops(t)->seq_show(btf, t, type_id, obj, 0, m); + show->btf = btf; + memset(&show->state, 0, sizeof(show->state)); + memset(&show->obj, 0, sizeof(show->obj)); + + btf_type_ops(t)->show(btf, t, type_id, obj, 0, show); +} + +static void btf_seq_show(struct btf_show *show, const char *fmt, ...) +{ + va_list args; + + va_start(args, fmt); + seq_vprintf((struct seq_file *)show->target, fmt, args); + va_end(args); +} + +void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m) +{ + struct btf_show sseq; + + sseq.target = m; + sseq.showfn = btf_seq_show; + sseq.flags = BTF_SHOW_NONAME | BTF_SHOW_COMPACT | BTF_SHOW_ZERO | + BTF_SHOW_UNSAFE; + + btf_type_show(btf, type_id, obj, &sseq); +} + +struct btf_show_snprintf { + struct btf_show show; + int len_left; /* space left in string */ + int len; /* length we would have written */ +}; + +static void btf_snprintf_show(struct btf_show *show, const char *fmt, ...) +{ + struct btf_show_snprintf *ssnprintf = (struct btf_show_snprintf *)show; + va_list args; + int len; + + va_start(args, fmt); + len = vsnprintf(show->target, ssnprintf->len_left, fmt, args); + va_end(args); + + if (len < 0) { + ssnprintf->len_left = 0; + ssnprintf->len = len; + } else if (len > ssnprintf->len_left) { + /* no space, drive on to get length we would have written */ + ssnprintf->len_left = 0; + ssnprintf->len += len; + } else { + ssnprintf->len_left -= len; + ssnprintf->len += len; + show->target += len; + } +} + +int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj, + char *buf, int len, u64 flags) +{ + struct btf_show_snprintf ssnprintf; + + ssnprintf.show.target = buf; + ssnprintf.show.flags = flags; + ssnprintf.show.showfn = btf_snprintf_show; + ssnprintf.len_left = len; + ssnprintf.len = 0; + + btf_type_show(btf, type_id, obj, (struct btf_show *)&ssnprintf); + + /* If we encontered an error, return it. */ + if (ssnprintf.show.state.status) + return ssnprintf.show.state.status; + + /* Otherwise return length we would have written */ + return ssnprintf.len; }
#ifdef CONFIG_PROC_FS
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is
long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags);
struct btf_ptr * specifies
- a pointer to the data to be traced; - the BTF id of the type of data pointed to; or - a string representation of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_SNPRINTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed.
For example a BPF program with a struct sk_buff *skb could do the following:
static const char skb_type[] = "struct sk_buff"; static struct btf_ptr b = { };
b.ptr = skb; b.type = skb_type; bpf_btf_snprintf(str, sizeof(str), &b, sizeof(b), 0, 0);
Default output looks like this:
(struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x000000007524fd8b, .data = (unsigned char *)0x000000007524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, }
Flags modifying display are as follows:
- BTF_F_COMPACT: no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px. - BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default
Signed-off-by: Alan Maguire alan.maguire@oracle.com --- include/linux/bpf.h | 1 + include/linux/btf.h | 9 +++-- include/uapi/linux/bpf.h | 68 ++++++++++++++++++++++++++++++++ kernel/bpf/helpers.c | 4 ++ kernel/trace/bpf_trace.c | 88 ++++++++++++++++++++++++++++++++++++++++++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 68 ++++++++++++++++++++++++++++++++ 7 files changed, 236 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c0ad5d8..9acbd59 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1787,6 +1787,7 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map, extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_btf_snprintf_proto;
const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/linux/btf.h b/include/linux/btf.h index d0f5d3c..3e5cdc2 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -6,6 +6,7 @@
#include <linux/types.h> #include <uapi/linux/btf.h> +#include <uapi/linux/bpf.h>
#define BTF_TYPE_EMIT(type) ((void)(type *)0)
@@ -59,10 +60,10 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, * - BTF_SHOW_UNSAFE: skip use of bpf_probe_read() to safely read * data before displaying it. */ -#define BTF_SHOW_COMPACT (1ULL << 0) -#define BTF_SHOW_NONAME (1ULL << 1) -#define BTF_SHOW_PTR_RAW (1ULL << 2) -#define BTF_SHOW_ZERO (1ULL << 3) +#define BTF_SHOW_COMPACT BTF_F_COMPACT +#define BTF_SHOW_NONAME BTF_F_NONAME +#define BTF_SHOW_PTR_RAW BTF_F_PTR_RAW +#define BTF_SHOW_ZERO BTF_F_ZERO #define BTF_SHOW_UNSAFE (1ULL << 4)
void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7dd3141..9b89b67 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3579,6 +3579,41 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type name or *ptr*->type_id. These values should + * specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + * larger programs can use map data to store the string + * representation. + * + * The string can be subsequently shared with userspace via + * bpf_perf_event_output() or ring buffer interfaces. + * bpf_trace_printk() is to be avoided as it places too small + * a limit on string size to be useful. + * + * *flags* is a combination of + * + * **BTF_F_COMPACT** + * no formatting around type information + * **BTF_F_NONAME** + * no struct/union member names/types + * **BTF_F_PTR_RAW** + * show raw (unobfuscated) pointer values; + * equivalent to printk specifier %px. + * **BTF_F_ZERO** + * show zero-valued struct/union members; they + * are not displayed by default + * + * Return + * The number of bytes that were written (or would have been + * written if output had to be truncated due to string size), + * or a negative error in cases of failure. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3730,6 +3765,7 @@ struct bpf_stack_build_id { FN(inode_storage_delete), \ FN(d_path), \ FN(copy_from_user), \ + FN(btf_snprintf), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -4838,4 +4874,36 @@ struct bpf_sk_lookup { __u32 local_port; /* Host byte order */ };
+/* + * struct btf_ptr is used for typed pointer representation; the + * additional type string/BTF type id are used to render the pointer + * data as the appropriate type via the bpf_btf_snprintf() helper + * above. A flags field - potentially to specify additional details + * about the BTF pointer (rather than its mode of display) - is + * present for future use. Display flags - BTF_F_* - are + * passed to bpf_btf_snprintf separately. + */ +struct btf_ptr { + void *ptr; + const char *type; + __u32 type_id; + __u32 flags; /* BTF ptr flags; unused at present. */ +}; + +/* + * Flags to control bpf_btf_snprintf() behaviour. + * - BTF_F_COMPACT: no formatting around type information + * - BTF_F_NONAME: no struct/union member names/types + * - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. + * - BTF_F_ZERO: show zero-valued struct/union members; they + * are not displayed by default + */ +enum { + BTF_F_COMPACT = (1ULL << 0), + BTF_F_NONAME = (1ULL << 1), + BTF_F_PTR_RAW = (1ULL << 2), + BTF_F_ZERO = (1ULL << 3), +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 5cc7425..500ff76 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -683,6 +683,10 @@ static int __bpf_strtoll(const char *buf, size_t buf_len, u64 flags, if (!perfmon_capable()) return NULL; return bpf_get_trace_printk_proto(); + case BPF_FUNC_btf_snprintf: + if (!perfmon_capable()) + return NULL; + return &bpf_btf_snprintf_proto; case BPF_FUNC_jiffies64: return &bpf_jiffies64_proto; default: diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index b2a5380..f171e03 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -7,6 +7,7 @@ #include <linux/slab.h> #include <linux/bpf.h> #include <linux/bpf_perf_event.h> +#include <linux/btf.h> #include <linux/filter.h> #include <linux/uaccess.h> #include <linux/ctype.h> @@ -16,6 +17,9 @@ #include <linux/error-injection.h> #include <linux/btf_ids.h>
+#include <uapi/linux/bpf.h> +#include <uapi/linux/btf.h> + #include <asm/tlb.h>
#include "trace_probe.h" @@ -1144,6 +1148,88 @@ static bool bpf_d_path_allowed(const struct bpf_prog *prog) .allowed = bpf_d_path_allowed, };
+#define BTF_F_ALL (BTF_F_COMPACT | BTF_F_NONAME | \ + BTF_F_PTR_RAW | BTF_F_ZERO) + +BPF_CALL_5(bpf_btf_snprintf, char *, str, u32, str_size, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + u8 btf_kind = BTF_KIND_TYPEDEF; + char type_name[KSYM_NAME_LEN]; + const struct btf_type *t; + const struct btf *btf; + const char *btf_type; + s32 btf_id; + int ret; + + if (unlikely(flags & ~(BTF_F_ALL))) + return -EINVAL; + + if (btf_ptr_size != sizeof(struct btf_ptr)) + return -EINVAL; + + btf = bpf_get_btf_vmlinux(); + + if (IS_ERR_OR_NULL(btf)) + return PTR_ERR(btf); + + if (ptr->type != NULL) { + ret = copy_from_kernel_nofault(type_name, ptr->type, + sizeof(type_name)); + if (ret) + return ret; + + btf_type = type_name; + + if (strncmp(btf_type, "struct ", strlen("struct ")) == 0) { + btf_kind = BTF_KIND_STRUCT; + btf_type += strlen("struct "); + } else if (strncmp(btf_type, "union ", strlen("union ")) == 0) { + btf_kind = BTF_KIND_UNION; + btf_type += strlen("union "); + } else if (strncmp(btf_type, "enum ", strlen("enum ")) == 0) { + btf_kind = BTF_KIND_ENUM; + btf_type += strlen("enum "); + } + + if (strlen(btf_type) == 0) + return -EINVAL; + + /* Assume type specified is a typedef as there's not much + * benefit in specifying int types other than wasting time + * on BTF lookups; we optimize for the most useful path. + * + * Fall back to BTF_KIND_INT if this fails. + */ + btf_id = btf_find_by_name_kind(btf, btf_type, btf_kind); + if (btf_id < 0) + btf_id = btf_find_by_name_kind(btf, btf_type, + BTF_KIND_INT); + } else if (ptr->type_id > 0) + btf_id = ptr->type_id; + else + return -EINVAL; + + if (btf_id > 0) + t = btf_type_by_id(btf, btf_id); + if (btf_id <= 0 || !t) + return -ENOENT; + + return btf_type_snprintf_show(btf, btf_id, ptr->ptr, str, str_size, + flags); +} + +const struct bpf_func_proto bpf_btf_snprintf_proto = { + .func = bpf_btf_snprintf, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_MEM, + .arg2_type = ARG_CONST_SIZE, + .arg3_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_CONST_SIZE, + .arg5_type = ARG_ANYTHING, +}; + const struct bpf_func_proto * bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { @@ -1230,6 +1316,8 @@ static bool bpf_d_path_allowed(const struct bpf_prog *prog) return &bpf_get_task_stack_proto; case BPF_FUNC_copy_from_user: return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL; + case BPF_FUNC_btf_snprintf: + return &bpf_btf_snprintf_proto; default: return NULL; } diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py index 0838817..7d86fdd 100755 --- a/scripts/bpf_helpers_doc.py +++ b/scripts/bpf_helpers_doc.py @@ -433,6 +433,7 @@ class PrinterHelpers(Printer): 'struct sk_msg_md', 'struct xdp_md', 'struct path', + 'struct btf_ptr', ] known_types = { '...', @@ -474,6 +475,7 @@ class PrinterHelpers(Printer): 'struct udp6_sock', 'struct task_struct', 'struct path', + 'struct btf_ptr', } mapped_types = { 'u8': '__u8', diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 7dd3141..9b89b67 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3579,6 +3579,41 @@ struct bpf_stack_build_id { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * long bpf_btf_snprintf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags) + * Description + * Use BTF to store a string representation of *ptr*->ptr in *str*, + * using *ptr*->type name or *ptr*->type_id. These values should + * specify the type *ptr*->ptr points to. Traversing that + * data structure using BTF, the type information and values are + * stored in the first *str_size* - 1 bytes of *str*. Safe copy of + * the pointer data is carried out to avoid kernel crashes during + * operation. Smaller types can use string space on the stack; + * larger programs can use map data to store the string + * representation. + * + * The string can be subsequently shared with userspace via + * bpf_perf_event_output() or ring buffer interfaces. + * bpf_trace_printk() is to be avoided as it places too small + * a limit on string size to be useful. + * + * *flags* is a combination of + * + * **BTF_F_COMPACT** + * no formatting around type information + * **BTF_F_NONAME** + * no struct/union member names/types + * **BTF_F_PTR_RAW** + * show raw (unobfuscated) pointer values; + * equivalent to printk specifier %px. + * **BTF_F_ZERO** + * show zero-valued struct/union members; they + * are not displayed by default + * + * Return + * The number of bytes that were written (or would have been + * written if output had to be truncated due to string size), + * or a negative error in cases of failure. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3730,6 +3765,7 @@ struct bpf_stack_build_id { FN(inode_storage_delete), \ FN(d_path), \ FN(copy_from_user), \ + FN(btf_snprintf), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -4838,4 +4874,36 @@ struct bpf_sk_lookup { __u32 local_port; /* Host byte order */ };
+/* + * struct btf_ptr is used for typed pointer representation; the + * additional type string/BTF type id are used to render the pointer + * data as the appropriate type via the bpf_btf_snprintf() helper + * above. A flags field - potentially to specify additional details + * about the BTF pointer (rather than its mode of display) - is + * present for future use. Display flags - BTF_F_* - are + * passed to bpf_btf_snprintf separately. + */ +struct btf_ptr { + void *ptr; + const char *type; + __u32 type_id; + __u32 flags; /* BTF ptr flags; unused at present. */ +}; + +/* + * Flags to control bpf_btf_snprintf() behaviour. + * - BTF_F_COMPACT: no formatting around type information + * - BTF_F_NONAME: no struct/union member names/types + * - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; + * equivalent to %px. + * - BTF_F_ZERO: show zero-valued struct/union members; they + * are not displayed by default + */ +enum { + BTF_F_COMPACT = (1ULL << 0), + BTF_F_NONAME = (1ULL << 1), + BTF_F_PTR_RAW = (1ULL << 2), + BTF_F_ZERO = (1ULL << 3), +}; + #endif /* _UAPI__LINUX_BPF_H__ */
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program.
Signed-off-by: Alan Maguire alan.maguire@oracle.com --- .../selftests/bpf/prog_tests/btf_snprintf.c | 55 +++++ .../selftests/bpf/progs/netif_receive_skb.c | 260 +++++++++++++++++++++ 2 files changed, 315 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_snprintf.c create mode 100644 tools/testing/selftests/bpf/progs/netif_receive_skb.c
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c b/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c new file mode 100644 index 0000000..8f277a5 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/btf_snprintf.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <test_progs.h> +#include <linux/btf.h> +#include "netif_receive_skb.skel.h" + +/* Demonstrate that bpf_btf_snprintf succeeds with non-zero return values, + * and that string representation of kernel data can then be displayed + * via bpf_trace_printk(). + */ +void test_btf_snprintf(void) +{ + struct netif_receive_skb *skel; + struct netif_receive_skb__bss *bss; + int err, duration = 0; + + skel = netif_receive_skb__open(); + if (CHECK(!skel, "skel_open", "failed to open skeleton\n")) + return; + + err = netif_receive_skb__load(skel); + if (CHECK(err, "skel_load", "failed to load skeleton: %d\n", err)) + goto cleanup; + + bss = skel->bss; + + err = netif_receive_skb__attach(skel); + if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err)) + goto cleanup; + + /* generate receive event */ + system("ping -c 1 127.0.0.1 > /dev/null"); + + /* + * Make sure netif_receive_skb program was triggered + * and it set expected return values from bpf_trace_printk()s + * and all tests ran. + */ + if (CHECK(bss->ret <= 0, + "bpf_btf_snprintf: got return value", + "ret <= 0 %ld test %d\n", bss->ret, bss->ran_subtests)) + goto cleanup; + + if (CHECK(bss->ran_subtests == 0, "check if subtests ran", + "no subtests ran, did BPF program run?")) + goto cleanup; + + if (CHECK(bss->num_subtests != bss->ran_subtests, + "check all subtests ran", + "only ran %d of %d tests\n", bss->num_subtests, + bss->ran_subtests)) + goto cleanup; + +cleanup: + netif_receive_skb__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/netif_receive_skb.c b/tools/testing/selftests/bpf/progs/netif_receive_skb.c new file mode 100644 index 0000000..dd08a7d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/netif_receive_skb.c @@ -0,0 +1,260 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <errno.h> + +long ret = 0; +int num_subtests = 0; +int ran_subtests = 0; + +#define STRSIZE 2048 +#define EXPECTED_STRSIZE 256 + +#ifndef ARRAY_SIZE +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) +#endif + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, char[STRSIZE]); +} strdata SEC(".maps"); + +static int __strncmp(const void *m1, const void *m2, size_t len) +{ + const unsigned char *s1 = m1; + const unsigned char *s2 = m2; + int i, delta = 0; + +#pragma clang loop unroll(full) + for (i = 0; i < len; i++) { + delta = s1[i] - s2[i]; + if (delta || s1[i] == 0 || s2[i] == 0) + break; + } + return delta; +} + +/* Use __builtin_btf_type_id to test btf_snprintf by type id instead of name */ +#if __has_builtin(__builtin_btf_type_id) +#define TEST_BTF_BY_ID(_str, _typestr, _ptr, _hflags) \ + do { \ + int _expected_ret = ret; \ + _ptr.type = 0; \ + _ptr.type_id = __builtin_btf_type_id(_typestr, 0); \ + ret = bpf_btf_snprintf(_str, STRSIZE, &_ptr, \ + sizeof(_ptr), _hflags); \ + if (ret != _expected_ret) { \ + bpf_printk("expected ret (%d), got (%d)", \ + _expected_ret, ret); \ + ret = -EBADMSG; \ + } \ + } while (0) +#else +#define TEST_BTF_BY_ID(_str, _typestr, _ptr, _hflags) \ + do { } while (0) +#endif + +#define TEST_BTF(_str, _type, _flags, _expected, ...) \ + do { \ + static const char _expectedval[EXPECTED_STRSIZE] = \ + _expected; \ + static const char _ptrtype[64] = #_type; \ + __u64 _hflags = _flags | BTF_F_COMPACT; \ + static _type _ptrdata = __VA_ARGS__; \ + static struct btf_ptr _ptr = { }; \ + int _cmp; \ + \ + _ptr.type = _ptrtype; \ + _ptr.ptr = &_ptrdata; \ + \ + ++num_subtests; \ + if (ret < 0) \ + break; \ + ++ran_subtests; \ + ret = bpf_btf_snprintf(_str, STRSIZE, \ + &_ptr, sizeof(_ptr), _hflags); \ + if (ret) \ + break; \ + _cmp = __strncmp(_str, _expectedval, EXPECTED_STRSIZE); \ + if (_cmp != 0) { \ + bpf_printk("(%d) got %s", _cmp, _str); \ + bpf_printk("(%d) expected %s", _cmp, \ + _expectedval); \ + ret = -EBADMSG; \ + break; \ + } \ + TEST_BTF_BY_ID(_str, #_type, _ptr, _hflags); \ + } while (0) + +/* Use where expected data string matches its stringified declaration */ +#define TEST_BTF_C(_str, _type, _flags, ...) \ + TEST_BTF(_str, _type, _flags, "(" #_type ")" #__VA_ARGS__, \ + __VA_ARGS__) + +/* TRACE_EVENT(netif_receive_skb, + * TP_PROTO(struct sk_buff *skb), + */ +SEC("tp_btf/netif_receive_skb") +int BPF_PROG(trace_netif_receive_skb, struct sk_buff *skb) +{ + static __u64 flags[] = { 0, BTF_F_COMPACT, BTF_F_ZERO, BTF_F_PTR_RAW, + BTF_F_NONAME, BTF_F_COMPACT | BTF_F_ZERO | + BTF_F_PTR_RAW | BTF_F_NONAME }; + static const char skbtype[] = "struct sk_buff"; + static struct btf_ptr p = { }; + __u32 key = 0; + int i, __ret; + char *str; + + str = bpf_map_lookup_elem(&strdata, &key); + if (!str) + return 0; + + /* Ensure we can write skb string representation */ + p.type = skbtype; + p.ptr = skb; + for (i = 0; i < ARRAY_SIZE(flags); i++) { + ++num_subtests; + ret = bpf_btf_snprintf(str, STRSIZE, &p, sizeof(p), 0); + if (ret < 0) + bpf_printk("returned %d when writing skb", ret); + ++ran_subtests; + } + + /* Check invalid ptr value */ + p.ptr = 0; + __ret = bpf_btf_snprintf(str, STRSIZE, &p, sizeof(p), 0); + if (__ret >= 0) { + bpf_printk("printing NULL should generate error, got (%d)", + __ret); + ret = -ERANGE; + } + + /* Verify type display for various types. */ + + /* simple int */ + TEST_BTF_C(str, int, 0, 1234); + TEST_BTF(str, int, BTF_F_NONAME, "1234", 1234); + /* zero value should be printed at toplevel */ + TEST_BTF(str, int, 0, "(int)0", 0); + TEST_BTF(str, int, BTF_F_NONAME, "0", 0); + TEST_BTF(str, int, BTF_F_ZERO, "(int)0", 0); + TEST_BTF(str, int, BTF_F_NONAME | BTF_F_ZERO, "0", 0); + TEST_BTF_C(str, int, 0, -4567); + TEST_BTF(str, int, BTF_F_NONAME, "-4567", -4567); + + /* simple char */ + TEST_BTF_C(str, char, 0, 100); + TEST_BTF(str, char, BTF_F_NONAME, "100", 100); + /* zero value should be printed at toplevel */ + TEST_BTF(str, char, 0, "(char)0", 0); + TEST_BTF(str, char, BTF_F_NONAME, "0", 0); + TEST_BTF(str, char, BTF_F_ZERO, "(char)0", 0); + TEST_BTF(str, char, BTF_F_NONAME | BTF_F_ZERO, "0", 0); + + /* simple typedef */ + TEST_BTF_C(str, uint64_t, 0, 100); + TEST_BTF(str, u64, BTF_F_NONAME, "1", 1); + /* zero value should be printed at toplevel */ + TEST_BTF(str, u64, 0, "(u64)0", 0); + TEST_BTF(str, u64, BTF_F_NONAME, "0", 0); + TEST_BTF(str, u64, BTF_F_ZERO, "(u64)0", 0); + TEST_BTF(str, u64, BTF_F_NONAME|BTF_F_ZERO, "0", 0); + + /* typedef struct */ + TEST_BTF_C(str, atomic_t, 0, {.counter = (int)1,}); + TEST_BTF(str, atomic_t, BTF_F_NONAME, "{1,}", {.counter = 1,}); + /* typedef with 0 value should be printed at toplevel */ + TEST_BTF(str, atomic_t, 0, "(atomic_t){}", {.counter = 0,}); + TEST_BTF(str, atomic_t, BTF_F_NONAME, "{}", {.counter = 0,}); + TEST_BTF(str, atomic_t, BTF_F_ZERO, "(atomic_t){.counter = (int)0,}", + {.counter = 0,}); + TEST_BTF(str, atomic_t, BTF_F_NONAME|BTF_F_ZERO, + "{0,}", {.counter = 0,}); + + /* enum where enum value does (and does not) exist */ + TEST_BTF_C(str, enum bpf_cmd, 0, BPF_MAP_CREATE); + TEST_BTF(str, enum bpf_cmd, 0, "(enum bpf_cmd)BPF_MAP_CREATE", 0); + TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME, "BPF_MAP_CREATE", + BPF_MAP_CREATE); + TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME|BTF_F_ZERO, + "BPF_MAP_CREATE", 0); + + TEST_BTF(str, enum bpf_cmd, BTF_F_ZERO, "(enum bpf_cmd)BPF_MAP_CREATE", + BPF_MAP_CREATE); + TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME|BTF_F_ZERO, + "BPF_MAP_CREATE", BPF_MAP_CREATE); + TEST_BTF_C(str, enum bpf_cmd, 0, 2000); + TEST_BTF(str, enum bpf_cmd, BTF_F_NONAME, "2000", 2000); + + /* simple struct */ + TEST_BTF_C(str, struct btf_enum, 0, + {.name_off = (__u32)3,.val = (__s32)-1,}); + TEST_BTF(str, struct btf_enum, BTF_F_NONAME, "{3,-1,}", + { .name_off = 3, .val = -1,}); + TEST_BTF(str, struct btf_enum, BTF_F_NONAME, "{-1,}", + { .name_off = 0, .val = -1,}); + TEST_BTF(str, struct btf_enum, BTF_F_NONAME|BTF_F_ZERO, "{0,-1,}", + { .name_off = 0, .val = -1,}); + /* empty struct should be printed */ + TEST_BTF(str, struct btf_enum, 0, "(struct btf_enum){}", + { .name_off = 0, .val = 0,}); + TEST_BTF(str, struct btf_enum, BTF_F_NONAME, "{}", + { .name_off = 0, .val = 0,}); + TEST_BTF(str, struct btf_enum, BTF_F_ZERO, + "(struct btf_enum){.name_off = (__u32)0,.val = (__s32)0,}", + { .name_off = 0, .val = 0,}); + + /* struct with pointers */ + TEST_BTF(str, struct list_head, BTF_F_PTR_RAW, + "(struct list_head){.next = (struct list_head *)0x0000000000000001,}", + { .next = (struct list_head *)1 }); + /* NULL pointer should not be displayed */ + TEST_BTF(str, struct list_head, BTF_F_PTR_RAW, + "(struct list_head){}", + { .next = (struct list_head *)0 }); + + /* struct with char array */ + TEST_BTF(str, struct bpf_prog_info, 0, + "(struct bpf_prog_info){.name = (char[])['f','o','o',],}", + { .name = "foo",}); + TEST_BTF(str, struct bpf_prog_info, BTF_F_NONAME, + "{['f','o','o',],}", + {.name = "foo",}); + /* leading null char means do not display string */ + TEST_BTF(str, struct bpf_prog_info, 0, + "(struct bpf_prog_info){}", + {.name = {'\0', 'f', 'o', 'o'}}); + /* handle non-printable characters */ + TEST_BTF(str, struct bpf_prog_info, 0, + "(struct bpf_prog_info){.name = (char[])[1,2,3,],}", + { .name = {1, 2, 3, 0}}); + + /* struct with non-char array */ + TEST_BTF(str, struct __sk_buff, 0, + "(struct __sk_buff){.cb = (__u32[])[1,2,3,4,5,],}", + { .cb = {1, 2, 3, 4, 5,},}); + TEST_BTF(str, struct __sk_buff, BTF_F_NONAME, + "{[1,2,3,4,5,],}", + { .cb = { 1, 2, 3, 4, 5},}); + /* For non-char, arrays, show non-zero values only */ + TEST_BTF(str, struct __sk_buff, 0, + "(struct __sk_buff){.cb = (__u32[])[1,],}", + { .cb = { 0, 0, 1, 0, 0},}); + + /* struct with bitfields */ + TEST_BTF_C(str, struct bpf_insn, 0, + {.code = (__u8)1,.dst_reg = (__u8)0x2,.src_reg = (__u8)0x3,.off = (__s16)4,.imm = (__s32)5,}); + TEST_BTF(str, struct bpf_insn, BTF_F_NONAME, "{1,0x2,0x3,4,5,}", + {.code = 1, .dst_reg = 0x2, .src_reg = 0x3, .off = 4, + .imm = 5,}); + + return 0; +} + +char _license[] SEC("license") = "GPL";
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is
long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags);
Flags and struct btf_ptr definitions/use are identical to the bpf_btf_snprintf helper, and the helper returns 0 on success or a negative error value.
Suggested-by: Alexei Starovoitov alexei.starovoitov@gmail.com Signed-off-by: Alan Maguire alan.maguire@oracle.com --- include/linux/btf.h | 3 ++ include/uapi/linux/bpf.h | 10 ++++++ kernel/bpf/btf.c | 17 +++++++--- kernel/trace/bpf_trace.c | 75 +++++++++++++++++++++++++++++++++--------- tools/include/uapi/linux/bpf.h | 10 ++++++ 5 files changed, 96 insertions(+), 19 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h index 3e5cdc2..eed23a4 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -69,6 +69,9 @@ const struct btf_type *btf_type_id_size(const struct btf *btf, void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, struct seq_file *m);
+int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags); + /* * Copy len bytes of string representation of obj of BTF type_id into buf. * diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 9b89b67..c0815f1 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3614,6 +3614,15 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per + * bpf_btf_snprintf() above. *flags* are identical to those + * used for bpf_btf_snprintf. + * Return + * 0 on success or a negative error in case of failure. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3766,6 +3775,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(btf_snprintf), \ + FN(seq_btf_write), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 70f5b88..0902464 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -5328,17 +5328,26 @@ static void btf_seq_show(struct btf_show *show, const char *fmt, ...) va_end(args); }
-void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, - struct seq_file *m) +int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m, u64 flags) { struct btf_show sseq;
sseq.target = m; sseq.showfn = btf_seq_show; - sseq.flags = BTF_SHOW_NONAME | BTF_SHOW_COMPACT | BTF_SHOW_ZERO | - BTF_SHOW_UNSAFE; + sseq.flags = flags;
btf_type_show(btf, type_id, obj, &sseq); + + return sseq.state.status; +} + +void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj, + struct seq_file *m) +{ + (void) btf_type_seq_show_flags(btf, type_id, obj, m, + BTF_SHOW_NONAME | BTF_SHOW_COMPACT | + BTF_SHOW_ZERO | BTF_SHOW_UNSAFE); }
struct btf_show_snprintf { diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index f171e03..eee36a8 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -71,6 +71,10 @@ static struct bpf_raw_event_map *bpf_get_raw_tracepoint_module(const char *name) u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); u64 bpf_get_stack(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
+static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id); + /** * trace_call_bpf - invoke BPF program * @call: tracepoint event @@ -780,6 +784,30 @@ struct bpf_seq_printf_buf { .btf_id = bpf_seq_write_btf_ids, };
+BPF_CALL_4(bpf_seq_btf_write, struct seq_file *, m, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf *btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + + return btf_type_seq_show_flags(btf, btf_id, ptr->ptr, m, flags); +} + +static const struct bpf_func_proto bpf_seq_btf_write_proto = { + .func = bpf_seq_btf_write, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_BTF_ID, + .arg2_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_CONST_SIZE_OR_ZERO, + .btf_id = bpf_seq_write_btf_ids, +}; + static __always_inline int get_map_perf_counter(struct bpf_map *map, u64 flags, u64 *value, u64 *enabled, u64 *running) @@ -1151,15 +1179,14 @@ static bool bpf_d_path_allowed(const struct bpf_prog *prog) #define BTF_F_ALL (BTF_F_COMPACT | BTF_F_NONAME | \ BTF_F_PTR_RAW | BTF_F_ZERO)
-BPF_CALL_5(bpf_btf_snprintf, char *, str, u32, str_size, struct btf_ptr *, ptr, - u32, btf_ptr_size, u64, flags) +static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size, + u64 flags, const struct btf **btf, + s32 *btf_id) { u8 btf_kind = BTF_KIND_TYPEDEF; char type_name[KSYM_NAME_LEN]; const struct btf_type *t; - const struct btf *btf; const char *btf_type; - s32 btf_id; int ret;
if (unlikely(flags & ~(BTF_F_ALL))) @@ -1168,10 +1195,10 @@ static bool bpf_d_path_allowed(const struct bpf_prog *prog) if (btf_ptr_size != sizeof(struct btf_ptr)) return -EINVAL;
- btf = bpf_get_btf_vmlinux(); + *btf = bpf_get_btf_vmlinux();
- if (IS_ERR_OR_NULL(btf)) - return PTR_ERR(btf); + if (IS_ERR_OR_NULL(*btf)) + return PTR_ERR(*btf);
if (ptr->type != NULL) { ret = copy_from_kernel_nofault(type_name, ptr->type, @@ -1201,20 +1228,34 @@ static bool bpf_d_path_allowed(const struct bpf_prog *prog) * * Fall back to BTF_KIND_INT if this fails. */ - btf_id = btf_find_by_name_kind(btf, btf_type, btf_kind); - if (btf_id < 0) - btf_id = btf_find_by_name_kind(btf, btf_type, - BTF_KIND_INT); + *btf_id = btf_find_by_name_kind(*btf, btf_type, btf_kind); + if (*btf_id < 0) + *btf_id = btf_find_by_name_kind(*btf, btf_type, + BTF_KIND_INT); } else if (ptr->type_id > 0) - btf_id = ptr->type_id; + *btf_id = ptr->type_id; else return -EINVAL;
- if (btf_id > 0) - t = btf_type_by_id(btf, btf_id); - if (btf_id <= 0 || !t) + if (*btf_id > 0) + t = btf_type_by_id(*btf, *btf_id); + if (*btf_id <= 0 || !t) return -ENOENT;
+ return 0; +} + +BPF_CALL_5(bpf_btf_snprintf, char *, str, u32, str_size, struct btf_ptr *, ptr, + u32, btf_ptr_size, u64, flags) +{ + const struct btf *btf; + s32 btf_id; + int ret; + + ret = bpf_btf_printf_prepare(ptr, btf_ptr_size, flags, &btf, &btf_id); + if (ret) + return ret; + return btf_type_snprintf_show(btf, btf_id, ptr->ptr, str, str_size, flags); } @@ -1715,6 +1756,10 @@ static void put_bpf_raw_tp_regs(void) return prog->expected_attach_type == BPF_TRACE_ITER ? &bpf_seq_write_proto : NULL; + case BPF_FUNC_seq_btf_write: + return prog->expected_attach_type == BPF_TRACE_ITER ? + &bpf_seq_btf_write_proto : + NULL; case BPF_FUNC_d_path: return &bpf_d_path_proto; default: diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 9b89b67..c0815f1 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3614,6 +3614,15 @@ struct bpf_stack_build_id { * The number of bytes that were written (or would have been * written if output had to be truncated due to string size), * or a negative error in cases of failure. + * + * long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags) + * Description + * Use BTF to write to seq_write a string representation of + * *ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per + * bpf_btf_snprintf() above. *flags* are identical to those + * used for bpf_btf_snprintf. + * Return + * 0 on success or a negative error in case of failure. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3766,6 +3775,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(btf_snprintf), \ + FN(seq_btf_write), \ /* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
On Fri, Sep 18, 2020 at 02:34:34PM +0100, Alan Maguire wrote:
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 9b89b67..c0815f1 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3614,6 +3614,15 @@ struct bpf_stack_build_id {
The number of bytes that were written (or would have been
written if output had to be truncated due to string size),
or a negative error in cases of failure.
- long bpf_seq_btf_write(struct seq_file *m, struct btf_ptr *ptr, u32 ptr_size, u64 flags)
- Description
Use BTF to write to seq_write a string representation of
*ptr*->ptr, using *ptr*->type name or *ptr*->type_id as per
bpf_btf_snprintf() above. *flags* are identical to those
used for bpf_btf_snprintf.
- Return
*/
0 on success or a negative error in case of failure.
#define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3766,6 +3775,7 @@ struct bpf_stack_build_id { FN(d_path), \ FN(copy_from_user), \ FN(btf_snprintf), \
- FN(seq_btf_write), \
bpf_btf_ prefix is a bit mouthful.
May be bpf_print_btf() and bpf_seq_print_btf() ?
-void btf_type_seq_show(const struct btf *btf, u32 type_id, void *obj,
struct seq_file *m)
+int btf_type_seq_show_flags(const struct btf *btf, u32 type_id, void *obj,
struct seq_file *m, u64 flags)
{ struct btf_show sseq; sseq.target = m; sseq.showfn = btf_seq_show;
- sseq.flags = BTF_SHOW_NONAME | BTF_SHOW_COMPACT | BTF_SHOW_ZERO |
BTF_SHOW_UNSAFE;
- sseq.flags = flags;
could you roll this change into patch 2?
-BPF_CALL_5(bpf_btf_snprintf, char *, str, u32, str_size, struct btf_ptr *, ptr,
u32, btf_ptr_size, u64, flags)
+static int bpf_btf_printf_prepare(struct btf_ptr *ptr, u32 btf_ptr_size,
u64 flags, const struct btf **btf,
s32 *btf_id)
Similarly pls introduce bpf_print_prepare() helper in the patch 3 right away. It reads odd when something that was just introduced in the previous patch gets refactored in the next.
The rest looks great to me. Please address build bot issues and respin.
Add a test verifying iterating over tasks and displaying BTF representation of data succeeds. Note here that we do not display the task_struct itself, as it will overflow the PAGE_SIZE limit on seq data; instead we write task->fs (a struct fs_struct).
Suggested-by: Alexei Starovoitov alexei.starovoitov@gmail.com Signed-off-by: Alan Maguire alan.maguire@oracle.com --- tools/testing/selftests/bpf/prog_tests/bpf_iter.c | 66 ++++++++++++++++++++++ .../selftests/bpf/progs/bpf_iter_task_btf.c | 49 ++++++++++++++++ 2 files changed, 115 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c index fe1a83b9..b9f13f9 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter.c @@ -7,6 +7,7 @@ #include "bpf_iter_task.skel.h" #include "bpf_iter_task_stack.skel.h" #include "bpf_iter_task_file.skel.h" +#include "bpf_iter_task_btf.skel.h" #include "bpf_iter_tcp4.skel.h" #include "bpf_iter_tcp6.skel.h" #include "bpf_iter_udp4.skel.h" @@ -167,6 +168,69 @@ static void test_task_file(void) bpf_iter_task_file__destroy(skel); }
+#define FSBUFSZ 8192 + +static char fsbuf[FSBUFSZ]; + +static void do_btf_read(struct bpf_program *prog) +{ + int iter_fd = -1, len = 0, bufleft = FSBUFSZ; + struct bpf_link *link; + char *buf = fsbuf; + + link = bpf_program__attach_iter(prog, NULL); + if (CHECK(IS_ERR(link), "attach_iter", "attach_iter failed\n")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (CHECK(iter_fd < 0, "create_iter", "create_iter failed\n")) + goto free_link; + + do { + len = read(iter_fd, buf, bufleft); + if (len > 0) { + buf += len; + bufleft -= len; + } + } while (len > 0); + + if (CHECK(len < 0, "read", "read failed: %s\n", strerror(errno))) + goto free_link; + + CHECK(strstr(fsbuf, "(struct fs_struct)") == NULL, + "check for btf representation of fs_struct in iter data", + "struct fs_struct not found"); +free_link: + if (iter_fd > 0) + close(iter_fd); + bpf_link__destroy(link); +} + +static void test_task_btf(void) +{ + struct bpf_iter_task_btf__bss *bss; + struct bpf_iter_task_btf *skel; + + skel = bpf_iter_task_btf__open_and_load(); + if (CHECK(!skel, "bpf_iter_task_btf__open_and_load", + "skeleton open_and_load failed\n")) + return; + + bss = skel->bss; + + do_btf_read(skel->progs.dump_task_fs_struct); + + if (CHECK(bss->tasks == 0, "check if iterated over tasks", + "no task iteration, did BPF program run?\n")) + goto cleanup; + + CHECK(bss->seq_err != 0, "check for unexpected err", + "bpf_seq_btf_write returned %ld", bss->seq_err); + +cleanup: + bpf_iter_task_btf__destroy(skel); +} + static void test_tcp4(void) { struct bpf_iter_tcp4 *skel; @@ -957,6 +1021,8 @@ void test_bpf_iter(void) test_task_stack(); if (test__start_subtest("task_file")) test_task_file(); + if (test__start_subtest("task_btf")) + test_task_btf(); if (test__start_subtest("tcp4")) test_tcp4(); if (test__start_subtest("tcp6")) diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c new file mode 100644 index 0000000..0451682 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/bpf_iter_task_btf.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Oracle and/or its affiliates. */ +#include "bpf_iter.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_tracing.h> +#include <errno.h> + +char _license[] SEC("license") = "GPL"; + +long tasks = 0; +long seq_err = 0; + +/* struct task_struct's BTF representation will overflow PAGE_SIZE so cannot + * be used here; instead dump a structure associated with each task. + */ +SEC("iter/task") +int dump_task_fs_struct(struct bpf_iter__task *ctx) +{ + static const char fs_type[] = "struct fs_struct"; + struct seq_file *seq = ctx->meta->seq; + struct task_struct *task = ctx->task; + struct fs_struct *fs = (void *)0; + static struct btf_ptr ptr = { }; + long ret; + + if (task) + fs = task->fs; + + ptr.type = fs_type; + ptr.ptr = fs; + + if (ctx->meta->seq_num == 0) + BPF_SEQ_PRINTF(seq, "Raw BTF fs_struct per task\n"); + + ret = bpf_seq_btf_write(seq, &ptr, sizeof(ptr), 0); + switch (ret) { + case 0: + tasks++; + break; + case -ERANGE: + /* NULL task or task->fs, don't count it as an error. */ + break; + default: + seq_err = ret; + break; + } + + return 0; +}
linux-kselftest-mirror@lists.linaro.org