Pahole fails to encode BTF for some Go projects (e.g. Kubernetes and Podman) due to recursive type definitions that create reference loops not representable in C. These recursive typedefs trigger a failure in the BTF deduplication algorithm.
This patch extends btf_dedup_struct_types() to properly handle potential recursion for BTF_KIND_TYPEDEF, similar to how recursion is already handled for BTF_KIND_STRUCT. This allows pahole to successfully generate BTF for Go binaries using recursive types without impacting existing C-based workflows.
Changes in v2: 1. Patch 1: Refactored code to prevent copying existing logic. Instead of adding a new function () we modify the existing btf_dedup_struct_type() function to handle the BTF_KIND_TYPEDEF case. Calls to btf_hash_struct() and btf_shallow_equal_struct() are replaced with calls to functions that select btf_hash_struct() / btf_hash_typedef() based on the type. 2. Patch 2: Added tests
v1: https://lore.kernel.org/lkml/20251107153408.159342-1-paulhoussel2@gmail.com/
Paul Houssel (2): libbpf: fix BTF dedup to support recursive typedef definitions selftests/bpf: add BTF dedup tests for recursive typedef definitions
tools/lib/bpf/btf.c | 59 +++++++++++++++---- tools/testing/selftests/bpf/prog_tests/btf.c | 61 ++++++++++++++++++++ 2 files changed, 110 insertions(+), 10 deletions(-)
Handle recursive typedefs in BTF deduplication
Pahole fails to encode BTF for some Go projects (e.g. Kubernetes and Podman) due to recursive type definitions that create reference loops not representable in C. These recursive typedefs trigger a failure in the BTF deduplication algorithm.
This patch extends btf_dedup_ref_type() to properly handle potential recursion for BTF_KIND_TYPEDEF, similar to how recursion is already handled for BTF_KIND_STRUCT. This allows pahole to successfully generate BTF for Go binaries using recursive types without impacting existing C-based workflows.
Co-developed-by: Martin Horth martin.horth@telecom-sudparis.eu Signed-off-by: Martin Horth martin.horth@telecom-sudparis.eu Co-developed-by: Ouail Derghal ouail.derghal@imt-atlantique.fr Signed-off-by: Ouail Derghal ouail.derghal@imt-atlantique.fr Co-developed-by: Guilhem Jazeron guilhem.jazeron@inria.fr Signed-off-by: Guilhem Jazeron guilhem.jazeron@inria.fr Co-developed-by: Ludovic Paillat ludovic.paillat@inria.fr Signed-off-by: Ludovic Paillat ludovic.paillat@inria.fr Co-developed-by: Robin Theveniaut robin.theveniaut@irit.fr Signed-off-by: Robin Theveniaut robin.theveniaut@irit.fr Suggested-by: Tristan d'Audibert tristan.daudibert@gmail.com Signed-off-by: Paul Houssel paul.houssel@orange.com
---
The issue was originally observed when attempting to encode BTF for Kubernetes binaries (kubectl, kubeadm):
$ git clone --depth 1 https://github.com/kubernetes/kubernetes $ cd ./kubernetes $ make kubeadm DBG=1 $ pahole --btf_encode_detached=kubeadm.btf _output/bin/kubeadm btf_encoder__encode: btf__dedup failed! Failed to encode BTF
The root cause lies in recursive type definitions that cannot exist in C but are valid in Go.
program.go:
"package main
type Foo func() Foo
func main() { bar() }
func bar() Foo { return nil }"
Building and encoding this program with pahole triggers the same deduplication failure:
$ go build -gcflags "all=-N -l" ./program.go $ pahole --btf_encode_detached=program.btf program btf_encoder__encode: btf__dedup failed! Failed to encode BTF
As noted in the comment of btf_dedup_ref_type(), the deduplication logic previously assumed recursion only occurs through structs or unions:
"[...] there is no danger of encountering cycles because in C type system the only way to form type cycle is through struct/union, so any chain of reference types, even those taking part in a type cycle, will inevitably reach struct/union at some point."
However, Go allows such recursion through typedef-like constructs (function types, aliases), requiring a special case for BTF_KIND_TYPEDEF.
This patch introduces that special handling, ensuring pahole can handle Go-generated BTFs while maintaining compatibility with existing C workflows. --- tools/lib/bpf/btf.c | 59 +++++++++++++++++++++++++++++++++++++-------- 1 file changed, 49 insertions(+), 10 deletions(-)
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c index 9f141395c074..5e7c2cae7b27 100644 --- a/tools/lib/bpf/btf.c +++ b/tools/lib/bpf/btf.c @@ -3901,6 +3901,20 @@ static int btf_dedup_strings(struct btf_dedup *d) return err; }
+/* + * Calculate type signature hash of TYPEDEF, ignoring referenced type IDs, + * as referenced type IDs equivalence is established separately during type + * graph equivalence check algorithm. + */ +static long btf_hash_typedef(struct btf_type *t) +{ + long h; + + h = hash_combine(0, t->name_off); + h = hash_combine(h, t->info); + return h; +} + static long btf_hash_common(struct btf_type *t) { long h; @@ -3918,6 +3932,13 @@ static bool btf_equal_common(struct btf_type *t1, struct btf_type *t2) t1->size == t2->size; }
+/* Check structural compatibility of two TYPEDEF. */ +static bool btf_equal_typedef(struct btf_type *t1, struct btf_type *t2) +{ + return t1->name_off == t2->name_off && + t1->info == t2->info; +} + /* Calculate type signature hash of INT or TAG. */ static long btf_hash_int_decl_tag(struct btf_type *t) { @@ -4844,14 +4865,31 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d) } }
+static inline long btf_hash_by_kind(struct btf_type *t, __u16 kind) +{ + if (kind == BTF_KIND_TYPEDEF) + return btf_hash_typedef(t); + else + return btf_hash_struct(t); +} + +static inline bool btf_equal_by_kind(struct btf_type *t1, struct btf_type *t2, __u16 kind) +{ + if (kind == BTF_KIND_TYPEDEF) + return btf_equal_typedef(t1, t2); + else + return btf_shallow_equal_struct(t1, t2); +} + /* - * Deduplicate struct/union types. + * Deduplicate struct/union and typedef types. * * For each struct/union type its type signature hash is calculated, taking * into account type's name, size, number, order and names of fields, but * ignoring type ID's referenced from fields, because they might not be deduped - * completely until after reference types deduplication phase. This type hash - * is used to iterate over all potential canonical types, sharing same hash. + * completely until after reference types deduplication phase. For each typedef + * type, the hash is computed based on the type’s name and size. This type hash + * is used to iterate over all potential canonical types, sharingsame hash. * For each canonical candidate we check whether type graphs that they form * (through referenced types in fields and so on) are equivalent using algorithm * implemented in `btf_dedup_is_equiv`. If such equivalence is found and @@ -4882,18 +4920,20 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id) t = btf_type_by_id(d->btf, type_id); kind = btf_kind(t);
- if (kind != BTF_KIND_STRUCT && kind != BTF_KIND_UNION) + if (kind != BTF_KIND_STRUCT && + kind != BTF_KIND_UNION && + kind != BTF_KIND_TYPEDEF) return 0;
- h = btf_hash_struct(t); + h = btf_hash_by_kind(t, kind); for_each_dedup_cand(d, hash_entry, h) { __u32 cand_id = hash_entry->value; int eq;
/* * Even though btf_dedup_is_equiv() checks for - * btf_shallow_equal_struct() internally when checking two - * structs (unions) for equivalence, we need to guard here + * btf_equal_by_kind() internally when checking two + * structs (unions) or typedefs for equivalence, we need to guard here * from picking matching FWD type as a dedup candidate. * This can happen due to hash collision. In such case just * relying on btf_dedup_is_equiv() would lead to potentially @@ -4901,7 +4941,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id) * FWD and compatible STRUCT/UNION are considered equivalent. */ cand_type = btf_type_by_id(d->btf, cand_id); - if (!btf_shallow_equal_struct(t, cand_type)) + if (!btf_equal_by_kind(t, cand_type, kind)) continue;
btf_dedup_clear_hypot_map(d); @@ -4939,7 +4979,7 @@ static int btf_dedup_struct_types(struct btf_dedup *d) /* * Deduplicate reference type. * - * Once all primitive and struct/union types got deduplicated, we can easily + * Once all primitive, struct/union and typedef types got deduplicated, we can easily * deduplicate all other (reference) BTF types. This is done in two steps: * * 1. Resolve all referenced type IDs into their canonical type IDs. This @@ -4982,7 +5022,6 @@ static int btf_dedup_ref_type(struct btf_dedup *d, __u32 type_id) case BTF_KIND_VOLATILE: case BTF_KIND_RESTRICT: case BTF_KIND_PTR: - case BTF_KIND_TYPEDEF: case BTF_KIND_FUNC: case BTF_KIND_TYPE_TAG: ref_type_id = btf_dedup_ref_type(d, t->type);
On Wed, 2025-11-12 at 15:11 +0100, Paul Houssel wrote:
Handle recursive typedefs in BTF deduplication
Pahole fails to encode BTF for some Go projects (e.g. Kubernetes and Podman) due to recursive type definitions that create reference loops not representable in C. These recursive typedefs trigger a failure in the BTF deduplication algorithm.
This patch extends btf_dedup_ref_type() to properly handle potential recursion for BTF_KIND_TYPEDEF, similar to how recursion is already handled for BTF_KIND_STRUCT. This allows pahole to successfully generate BTF for Go binaries using recursive types without impacting existing C-based workflows.
Co-developed-by: Martin Horth martin.horth@telecom-sudparis.eu Signed-off-by: Martin Horth martin.horth@telecom-sudparis.eu Co-developed-by: Ouail Derghal ouail.derghal@imt-atlantique.fr Signed-off-by: Ouail Derghal ouail.derghal@imt-atlantique.fr Co-developed-by: Guilhem Jazeron guilhem.jazeron@inria.fr Signed-off-by: Guilhem Jazeron guilhem.jazeron@inria.fr Co-developed-by: Ludovic Paillat ludovic.paillat@inria.fr Signed-off-by: Ludovic Paillat ludovic.paillat@inria.fr Co-developed-by: Robin Theveniaut robin.theveniaut@irit.fr Signed-off-by: Robin Theveniaut robin.theveniaut@irit.fr Suggested-by: Tristan d'Audibert tristan.daudibert@gmail.com Signed-off-by: Paul Houssel paul.houssel@orange.com
No differences in BTF generated for kernel when using pahole built against libbpf with and without this patch.
Acked-by: Eduard Zingerman eddyz87@gmail.com
@@ -4939,7 +4979,7 @@ static int btf_dedup_struct_types(struct btf_dedup *d) /*
- Deduplicate reference type.
- Once all primitive and struct/union types got deduplicated, we can easily
- Once all primitive, struct/union and typedef types got deduplicated, we can easily
- deduplicate all other (reference) BTF types. This is done in two steps:
- Resolve all referenced type IDs into their canonical type IDs. This
Nit: this passage continues as:
* There is no danger of encountering cycles because in C type * system the only way to form type cycle is through struct/union, so any chain * of reference types, even those taking part in a type cycle, will inevitably * reach struct/union at some point.
I think it needs adjustment to refer to typedef as well.
[...]
Add several ./test_progs tests: 1. btf/dedup:recursive typedef ensures that deduplication no longer fails on recursive typedefs. 2. btf/dedup:typedef ensures that typedefs are deduplicated correctly just as they were before this patch.
Signed-off-by: Paul Houssel paul.houssel@orange.com --- tools/testing/selftests/bpf/prog_tests/btf.c | 61 ++++++++++++++++++++ 1 file changed, 61 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c index 8a9ba4292109..a19db159475a 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf.c +++ b/tools/testing/selftests/bpf/prog_tests/btf.c @@ -7495,6 +7495,67 @@ static struct btf_dedup_test dedup_tests[] = { BTF_STR_SEC("\0t\0m1\0m2\0tag1\0tag2\0tag3"), }, }, +{ + .descr = "dedup: recursive typedef", + /* + * This test simulates a recursive typedef, which in GO is defined as such: + * + * type Foo func() Foo + * + * In BTF terms, this is represented as a TYPEDEF referencing + * a FUNC_PROTO that returns the same TYPEDEF. + */ + .input = { + .raw_types = { + /* + * [1] typedef Foo -> func() Foo + * [2] func_proto() -> Foo + */ + BTF_TYPEDEF_ENC(NAME_NTH(1), 2), /* [1] */ + BTF_FUNC_PROTO_ENC(1, 0), /* [2] */ + BTF_END_RAW, + }, + BTF_STR_SEC("\0Foo"), + }, + .expect = { + .raw_types = { + BTF_TYPEDEF_ENC(NAME_NTH(1), 2), /* [1] */ + BTF_FUNC_PROTO_ENC(1, 0), /* [2] */ + BTF_END_RAW, + }, + BTF_STR_SEC("\0Foo"), + }, +}, +{ + .descr = "dedup: typedef", + /* + * // CU 1: + * typedef int foo; + * + * // CU 2: + * typedef int foo; + */ + .input = { + .raw_types = { + /* CU 1 */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_TYPEDEF_ENC(NAME_NTH(1), 1), /* [2] */ + /* CU 2 */ + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [3] */ + BTF_TYPEDEF_ENC(NAME_NTH(1), 3), /* [4] */ + BTF_END_RAW, + }, + BTF_STR_SEC("\0foo"), + }, + .expect = { + .raw_types = { + BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */ + BTF_TYPEDEF_ENC(NAME_NTH(1), 1), /* [2] */ + BTF_END_RAW, + }, + BTF_STR_SEC("\0foo"), + }, +}, { .descr = "dedup: typedef tags", .input = {
On Wed, 2025-11-12 at 15:11 +0100, Paul Houssel wrote:
Lgtm, one nit below:
+{
- .descr = "dedup: recursive typedef",
- /*
* This test simulates a recursive typedef, which in GO is defined as such:** type Foo func() Foo** In BTF terms, this is represented as a TYPEDEF referencing* a FUNC_PROTO that returns the same TYPEDEF.*/- .input = {
.raw_types = {/** [1] typedef Foo -> func() Foo* [2] func_proto() -> Foo*/BTF_TYPEDEF_ENC(NAME_NTH(1), 2), /* [1] */BTF_FUNC_PROTO_ENC(1, 0), /* [2] */
Nit: Maybe repeat the above two types, just to make sure that deduplication happens?
BTF_END_RAW,},BTF_STR_SEC("\0Foo"),- },
- .expect = {
.raw_types = {BTF_TYPEDEF_ENC(NAME_NTH(1), 2), /* [1] */BTF_FUNC_PROTO_ENC(1, 0), /* [2] */BTF_END_RAW,},BTF_STR_SEC("\0Foo"),- },
+},
[...]
linux-kselftest-mirror@lists.linaro.org