Hi,

Enclosed is a pair of patches for an oops that can occur if an exception is generated while a bpf subprogram is running. One of the bpf_prog_aux entries for the subprograms is missing an extable. This can turn an exception that would otherwise be handled into a NULL pointer dereference.
The bulk of the change here is simply adding a pair of programs for the selftest. The proposed fix in this iteration is a 1-line change.
These changes were tested via the verifier and progs selftests and no regressions were observed.
Changes from v1:
- Add a selftest (feedback from Alexei Starovoitov)
- Move to a 1-line verifier change instead of searching multiple extables
Krister Johansen (2):
  Add a selftest for subprogram extables
  bpf: ensure main program has an extable

 kernel/bpf/verifier.c                         |  1 +
 .../bpf/prog_tests/subprogs_extable.c         | 35 +++++++++
 .../bpf/progs/test_subprogs_extable.c         | 71 +++++++++++++++++++
 3 files changed, 107 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_subprogs_extable.c
In certain situations a program with subprograms may have a NULL extable entry. This should not happen, and when it does, it turns a single trap into multiple. Add a test case for further debugging and to prevent regressions. N.b: without any other patches this can panic or oops a kernel.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
---
 .../bpf/prog_tests/subprogs_extable.c         | 35 +++++++++
 .../bpf/progs/test_subprogs_extable.c         | 71 +++++++++++++++++++
 2 files changed, 106 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_subprogs_extable.c
diff --git a/tools/testing/selftests/bpf/prog_tests/subprogs_extable.c b/tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
new file mode 100644
index 000000000000..18169b7eedf8
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+
+#include <test_progs.h>
+#include <stdbool.h>
+#include "test_subprogs_extable.skel.h"
+
+static int duration;
+
+void test_subprogs_extable(void)
+{
+	const int READ_SZ = 456;
+	struct test_subprogs_extable *skel;
+	int err;
+
+	skel = test_subprogs_extable__open();
+	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
+		return;
+
+	err = test_subprogs_extable__load(skel);
+	if (CHECK(err, "skel_load", "failed to load skeleton\n"))
+		return;
+
+	err = test_subprogs_extable__attach(skel);
+	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
+		goto cleanup;
+
+	/* trigger tracepoint */
+	ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
+
+	test_subprogs_extable__detach(skel);
+
+cleanup:
+	test_subprogs_extable__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_subprogs_extable.c b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
new file mode 100644
index 000000000000..408137eaaa07
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+#include "../bpf_testmod/bpf_testmod.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 8);
+	__type(key, __u32);
+	__type(value, __u64);
+} test_array SEC(".maps");
+
+static __u64 test_cb(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
+{
+	return 1;
+}
+
+static __u64 test_cb2(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
+{
+	return 1;
+}
+
+static __u64 test_cb3(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
+{
+	return 1;
+}
+
+SEC("fexit/bpf_testmod_return_ptr")
+int BPF_PROG(handle_fexit_ret_subprogs, int arg, struct file *ret)
+{
+	long buf = 0;
+
+	bpf_probe_read_kernel(&buf, 8, ret);
+	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
+	*(volatile long long *)ret;
+	*(volatile int *)&ret->f_mode;
+	bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_return_ptr")
+int BPF_PROG(handle_fexit_ret_subprogs2, int arg, struct file *ret)
+{
+	long buf = 0;
+
+	bpf_probe_read_kernel(&buf, 8, ret);
+	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
+	*(volatile long long *)ret;
+	*(volatile int *)&ret->f_mode;
+	bpf_for_each_map_elem(&test_array, test_cb2, NULL, 0);
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_return_ptr")
+int BPF_PROG(handle_fexit_ret_subprogs3, int arg, struct file *ret)
+{
+	long buf = 0;
+
+	bpf_probe_read_kernel(&buf, 8, ret);
+	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
+	*(volatile long long *)ret;
+	*(volatile int *)&ret->f_mode;
+	bpf_for_each_map_elem(&test_array, test_cb3, NULL, 0);
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
On 6/7/23 2:04 PM, Krister Johansen wrote:
> In certain situations a program with subprograms may have a NULL extable entry. This should not happen, and when it does, it turns a single trap into multiple. Add a test case for further debugging and to prevent regressions. N.b: without any other patches this can panic or oops a kernel.
>
> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> ---
>  .../bpf/prog_tests/subprogs_extable.c         | 35 +++++++++
>  .../bpf/progs/test_subprogs_extable.c         | 71 +++++++++++++++++++
>  2 files changed, 106 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_subprogs_extable.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/subprogs_extable.c b/tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
> new file mode 100644
> index 000000000000..18169b7eedf8
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
> @@ -0,0 +1,35 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Facebook */
This copyright is not correct.
> +
> +#include <test_progs.h>
> +#include <stdbool.h>
stdbool.h is not needed.
> +#include "test_subprogs_extable.skel.h"
> +
> +static int duration;
> +
> +void test_subprogs_extable(void)
> +{
> +	const int READ_SZ = 456;
> +	struct test_subprogs_extable *skel;
> +	int err;
> +
> +	skel = test_subprogs_extable__open();
> +	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
> +		return;
Please use ASSERT_* macros instead of CHECK macro. The same for below. See some examples in prog_tests directory.
> +
> +	err = test_subprogs_extable__load(skel);
> +	if (CHECK(err, "skel_load", "failed to load skeleton\n"))
> +		return;
goto cleanup;
> +
> +	err = test_subprogs_extable__attach(skel);
> +	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
> +		goto cleanup;
> +
> +	/* trigger tracepoint */
> +	ASSERT_OK(trigger_module_test_read(READ_SZ), "trigger_read");
> +
> +	test_subprogs_extable__detach(skel);
> +
> +cleanup:
> +	test_subprogs_extable__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_subprogs_extable.c b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
> new file mode 100644
> index 000000000000..408137eaaa07
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
> @@ -0,0 +1,71 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Facebook */
the above copyright is not correct.
> +
> +#include "vmlinux.h"
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/bpf_core_read.h>
There is no CORE related operation in the program. The above header is not needed.
> +#include "../bpf_testmod/bpf_testmod.h"
This one is not needed either.
> +
> +struct {
> +	__uint(type, BPF_MAP_TYPE_ARRAY);
> +	__uint(max_entries, 8);
> +	__type(key, __u32);
> +	__type(value, __u64);
> +} test_array SEC(".maps");
> +
> +static __u64 test_cb(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
> +{
> +	return 1;
> +}
> +
> +static __u64 test_cb2(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
> +{
> +	return 1;
> +}
> +
> +static __u64 test_cb3(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
> +{
> +	return 1;
> +}
We can just have one test_cb and use it for all programs, right? Or do more subprograms increase the chance of the test failing?
> +
> +SEC("fexit/bpf_testmod_return_ptr")
> +int BPF_PROG(handle_fexit_ret_subprogs, int arg, struct file *ret)
> +{
> +	long buf = 0;
> +
> +	bpf_probe_read_kernel(&buf, 8, ret);
> +	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
The above bpf_probe_read_kernel() things are not necessary, right?
> +	*(volatile long long *)ret;
just 'volatile long' should be enough.
> +	*(volatile int *)&ret->f_mode;
> +	bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);
> +	return 0;
> +}
> +
> +SEC("fexit/bpf_testmod_return_ptr")
> +int BPF_PROG(handle_fexit_ret_subprogs2, int arg, struct file *ret)
> +{
> +	long buf = 0;
> +
> +	bpf_probe_read_kernel(&buf, 8, ret);
> +	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
> +	*(volatile long long *)ret;
> +	*(volatile int *)&ret->f_mode;
> +	bpf_for_each_map_elem(&test_array, test_cb2, NULL, 0);
> +	return 0;
> +}
> +
> +SEC("fexit/bpf_testmod_return_ptr")
> +int BPF_PROG(handle_fexit_ret_subprogs3, int arg, struct file *ret)
> +{
> +	long buf = 0;
> +
> +	bpf_probe_read_kernel(&buf, 8, ret);
> +	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
> +	*(volatile long long *)ret;
> +	*(volatile int *)&ret->f_mode;
> +	bpf_for_each_map_elem(&test_array, test_cb3, NULL, 0);
> +	return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
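The review comments on the test harness (use ASSERT_* rather than CHECK, and jump to cleanup rather than returning once the skeleton exists) describe a control-flow pattern that can be sketched outside the kernel tree. The block below is only an illustration under stated assumptions: the two macros are simplified stand-ins that keep just the boolean result of the selftest macros of the same name from test_progs.h, and struct toy_skel, enum toy_fail and every toy_* function are hypothetical replacements for the generated skeleton.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Stand-ins that keep only the boolean result of the selftest macros of the
 * same name. The real ones come from test_progs.h and also log failures.
 */
#define ASSERT_OK_PTR(ptr, name) ((ptr) != NULL)
#define ASSERT_OK(res, name)     ((res) == 0)

/* Hypothetical steps at which the toy skeleton can be told to fail. */
enum toy_fail { TOY_FAIL_NONE, TOY_FAIL_LOAD, TOY_FAIL_ATTACH };

struct toy_skel {
	int loaded;
	int attached;
};

static int toy_live_skels; /* skeletons allocated and not yet destroyed */

static struct toy_skel *toy_skel__open(void)
{
	struct toy_skel *skel = calloc(1, sizeof(*skel));

	if (skel)
		toy_live_skels++;
	return skel;
}

static int toy_skel__load(struct toy_skel *skel, enum toy_fail fail)
{
	assert(skel);
	if (fail == TOY_FAIL_LOAD)
		return -1;
	skel->loaded = 1;
	return 0;
}

static int toy_skel__attach(struct toy_skel *skel, enum toy_fail fail)
{
	assert(skel && skel->loaded);
	if (fail == TOY_FAIL_ATTACH)
		return -1;
	skel->attached = 1;
	return 0;
}

static void toy_skel__destroy(struct toy_skel *skel)
{
	free(skel);
	toy_live_skels--;
}

/* Runs the test body; returns how many skeletons are still allocated. */
static int toy_test_subprogs_extable(enum toy_fail fail)
{
	struct toy_skel *skel;
	int err;

	skel = toy_skel__open();
	if (!ASSERT_OK_PTR(skel, "skel_open"))
		return toy_live_skels;

	err = toy_skel__load(skel, fail);
	if (!ASSERT_OK(err, "skel_load"))
		goto cleanup; /* a bare return here would leak skel */

	err = toy_skel__attach(skel, fail);
	if (!ASSERT_OK(err, "skel_attach"))
		goto cleanup;

	/* The real test would trigger the traced function here. */

cleanup:
	toy_skel__destroy(skel);
	return toy_live_skels;
}
```

Calling toy_test_subprogs_extable() with each enum toy_fail value exercises every exit path; in all of them the skeleton is destroyed exactly once.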
On 6/7/23 2:04 PM, Krister Johansen wrote:
> In certain situations a program with subprograms may have a NULL extable entry. This should not happen, and when it does, it turns a single trap into multiple. Add a test case for further debugging and to prevent regressions. N.b: without any other patches this can panic or oops a kernel.
Also, it would be great if you can show the kernel oops stack trace.
> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> ---
>  .../bpf/prog_tests/subprogs_extable.c         | 35 +++++++++
>  .../bpf/progs/test_subprogs_extable.c         | 71 +++++++++++++++++++
>  2 files changed, 106 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/subprogs_extable.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_subprogs_extable.c
[...]
On Thu, Jun 8, 2023 at 10:40 AM Yonghong Song <yhs@meta.com> wrote:
> On 6/7/23 2:04 PM, Krister Johansen wrote:
> > In certain situations a program with subprograms may have a NULL extable entry. This should not happen, and when it does, it turns a single trap into multiple. Add a test case for further debugging and to prevent regressions. N.b: without any other patches this can panic or oops a kernel.
>
> Also, it would be great if you can show the kernel oops stack trace.
+1
Also please reorder the patches.
patch 1 - fix
patch 2 - test for the fix.
When bpf subprograms are in use, the main program is not jit'd after the subprograms because jit_subprogs sets a value for prog->bpf_func upon success. Subsequent calls to the JIT are bypassed when this value is non-NULL. This leads to a situation where the main program and its func[0] counterpart are both in the bpf kallsyms tree, but only func[0] has an extable. Extables are only created during JIT. Now there are two nearly identical program ksym entries in the tree, but only one has an extable. Depending upon how the entries are placed, there's a chance that a fault will call search_extable on the aux with the NULL entry.
Since jit_subprogs already copies state from func[0] to the main program, include the extable pointer in this state duplication. The alternative is to skip adding the main program to the bpf_kallsyms table, but that would mean adding a check for subprograms into the middle of bpf_prog_load.
Cc: stable@vger.kernel.org
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
---
 kernel/bpf/verifier.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5871aa78d01a..d6939db9fbf9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17242,6 +17242,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 	prog->jited = 1;
 	prog->bpf_func = func[0]->bpf_func;
 	prog->jited_len = func[0]->jited_len;
+	prog->aux->extable = func[0]->aux->extable;
 	prog->aux->func = func;
 	prog->aux->func_cnt = env->subprog_cnt;
 	bpf_prog_jit_attempt_done(prog);
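The failure mode this patch describes can be pictured with a small userspace model. It is a sketch under simplifying assumptions, not kernel code: struct toy_prog, struct toy_aux, toy_search_extable() and toy_jit_subprogs() are hypothetical stand-ins for the structures and functions named in the commit message, the table is terminated by a zero entry rather than carrying a count, and a NULL table is reported as "not handled" where the kernel would fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_extable_entry {
	unsigned long insn;   /* address of an instruction that may fault */
	unsigned long fixup;  /* address to resume at after the fault */
};

struct toy_aux {
	const struct toy_extable_entry *extable; /* ends with insn == 0 */
};

struct toy_prog {
	unsigned long start;  /* start of the JITed image */
	unsigned long len;    /* length of the JITed image */
	struct toy_aux aux;
};

/*
 * Model of an extable lookup: return the fixup address for a faulting
 * instruction, or 0 when the fault cannot be handled. A NULL table is
 * reported as unhandled here instead of being dereferenced.
 */
static unsigned long toy_search_extable(const struct toy_prog *p,
					unsigned long addr)
{
	const struct toy_extable_entry *e;

	if (!p->aux.extable)
		return 0;
	for (e = p->aux.extable; e->insn; e++)
		if (e->insn == addr)
			return e->fixup;
	return 0;
}

/*
 * Model of the state duplication at the end of subprogram JIT: the main
 * program takes over func0's image. With copy_extable set it also takes
 * over the exception table pointer, mirroring the added line.
 */
static void toy_jit_subprogs(struct toy_prog *main_prog,
			     const struct toy_prog *func0, bool copy_extable)
{
	assert(main_prog && func0);
	main_prog->start = func0->start;
	main_prog->len = func0->len;
	if (copy_extable)
		main_prog->aux.extable = func0->aux.extable;
}

/* Returns true when a fault looked up via the main program is handled. */
static bool toy_fault_is_handled(bool copy_extable)
{
	static const struct toy_extable_entry table[] = {
		{ .insn = 0x1010, .fixup = 0x1040 },
		{ .insn = 0, .fixup = 0 },
	};
	struct toy_prog func0 = {
		.start = 0x1000, .len = 0x100,
		.aux = { .extable = table },
	};
	struct toy_prog main_prog = { 0 };

	toy_jit_subprogs(&main_prog, &func0, copy_extable);
	return toy_search_extable(&main_prog, 0x1010) == 0x1040;
}
```

In this model toy_fault_is_handled(false) corresponds to the state before the patch, where a lookup that lands on the main program finds no table, and toy_fault_is_handled(true) corresponds to the state after it.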
On 6/7/23 2:04 PM, Krister Johansen wrote:
> When bpf subprograms are in use, the main program is not jit'd after the subprograms because jit_subprogs sets a value for prog->bpf_func upon success. Subsequent calls to the JIT are bypassed when this value is non-NULL. This leads to a situation where the main program and its func[0] counterpart are both in the bpf kallsyms tree, but only func[0] has an extable. Extables are only created during JIT. Now there are two nearly identical program ksym entries in the tree, but only one has an extable. Depending upon how the entries are placed, there's a chance that a fault will call search_extable on the aux with the NULL entry.
>
> Since jit_subprogs already copies state from func[0] to the main program, include the extable pointer in this state duplication. The alternative is to skip adding the main program to the bpf_kallsyms table, but that would mean adding a check for subprograms into the middle of bpf_prog_load.
I think having two nearly identical program ksym entries is bad. When people 'cat /proc/kallsyms | grep <their program name>', they will find two programs with identical kernel addresses but different hash values. This is just very confusing. I think removing the duplicate in kallsyms is better from the user's perspective.
> Cc: stable@vger.kernel.org
> Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> ---
>  kernel/bpf/verifier.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 5871aa78d01a..d6939db9fbf9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17242,6 +17242,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  	prog->jited = 1;
>  	prog->bpf_func = func[0]->bpf_func;
>  	prog->jited_len = func[0]->jited_len;
> +	prog->aux->extable = func[0]->aux->extable;
>  	prog->aux->func = func;
>  	prog->aux->func_cnt = env->subprog_cnt;
>  	bpf_prog_jit_attempt_done(prog);
On Thu, Jun 08, 2023 at 10:38:12AM -0700, Yonghong Song wrote:
> On 6/7/23 2:04 PM, Krister Johansen wrote:
> > When bpf subprograms are in use, the main program is not jit'd after the subprograms because jit_subprogs sets a value for prog->bpf_func upon success. Subsequent calls to the JIT are bypassed when this value is non-NULL. This leads to a situation where the main program and its func[0] counterpart are both in the bpf kallsyms tree, but only func[0] has an extable. Extables are only created during JIT. Now there are two nearly identical program ksym entries in the tree, but only one has an extable. Depending upon how the entries are placed, there's a chance that a fault will call search_extable on the aux with the NULL entry.
> >
> > Since jit_subprogs already copies state from func[0] to the main program, include the extable pointer in this state duplication. The alternative is to skip adding the main program to the bpf_kallsyms table, but that would mean adding a check for subprograms into the middle of bpf_prog_load.
>
> I think having two nearly identical program ksym entries is bad. When people 'cat /proc/kallsyms | grep <their program name>', they will find two programs with identical kernel addresses but different hash values. This is just very confusing. I think removing the duplicate in kallsyms is better from the user's perspective.
Thanks for all the feedback.
In terms of resolving this confusion my inclination is to use the main program. That way users see in kallsyms the same tag that is reported by bpftool. On the other hand, the tag in kallsyms won't match the sha1 of that actual chunk of code. Is anything relying on the hash in the tag and the digest of the code agreeing?
-K
On Wed, Jun 7, 2023 at 2:04 PM Krister Johansen <kjlx@templeofstupid.com> wrote:
> When bpf subprograms are in use, the main program is not jit'd after the subprograms because jit_subprogs sets a value for prog->bpf_func upon success. Subsequent calls to the JIT are bypassed when this value is non-NULL. This leads to a situation where the main program and its func[0] counterpart are both in the bpf kallsyms tree, but only func[0] has an extable. Extables are only created during JIT. Now there are two nearly identical program ksym entries in the tree, but only one has an extable. Depending upon how the entries are placed, there's a chance that a fault will call search_extable on the aux with the NULL entry.
>
> Since jit_subprogs already copies state from func[0] to the main program, include the extable pointer in this state duplication. The alternative is to skip adding the main program to the bpf_kallsyms table, but that would mean adding a check for subprograms into the middle of bpf_prog_load.
adding a check to bpf_prog_load() isn't great. that's true, but...
> Cc: stable@vger.kernel.org
> Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> ---
>  kernel/bpf/verifier.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 5871aa78d01a..d6939db9fbf9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -17242,6 +17242,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
>  	prog->jited = 1;
>  	prog->bpf_func = func[0]->bpf_func;
>  	prog->jited_len = func[0]->jited_len;
> +	prog->aux->extable = func[0]->aux->extable;
Why not do this hunk and also what I suggested earlier: start from func=1? That will address the double ksym insertion that Yonghong mentioned.
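The "start from func=1" suggestion can be pictured with a toy model of symbol registration. This is a sketch, not the kernel's implementation: struct toy_ksyms, toy_ksym_add(), toy_register_all() and toy_entries_at_main() are hypothetical names, and the only behaviour taken from the thread is that the load path registers the main program while the subprogram loop also registers func[0] at the same address.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_KSYMS 16

/* Hypothetical symbol table: only the registered start addresses. */
struct toy_ksyms {
	unsigned long addr[TOY_MAX_KSYMS];
	size_t cnt;
};

static void toy_ksym_add(struct toy_ksyms *t, unsigned long addr)
{
	assert(t->cnt < TOY_MAX_KSYMS);
	t->addr[t->cnt++] = addr;
}

/* Count how many registered symbols start at addr. */
static size_t toy_ksym_count(const struct toy_ksyms *t, unsigned long addr)
{
	size_t i, n = 0;

	for (i = 0; i < t->cnt; i++)
		if (t->addr[i] == addr)
			n++;
	return n;
}

/*
 * Register the main program once, then register func[first..cnt).
 * func[0] shares its start address with the main program.
 */
static size_t toy_register_all(const unsigned long *func, size_t cnt,
			       size_t first)
{
	struct toy_ksyms t = { .cnt = 0 };
	size_t i;

	toy_ksym_add(&t, func[0]);         /* main program entry */
	for (i = first; i < cnt; i++)
		toy_ksym_add(&t, func[i]); /* subprogram entries */
	return toy_ksym_count(&t, func[0]);
}

/* Number of symbols at the main program's address for a given start index. */
static size_t toy_entries_at_main(size_t first)
{
	static const unsigned long funcs[] = { 0x1000, 0x2000, 0x3000 };

	return toy_register_all(funcs, sizeof(funcs) / sizeof(funcs[0]), first);
}
```

With the loop starting at index 0 the model ends up with two symbols at the main program's address; starting at index 1 leaves one.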
On Thu, Jun 08, 2023 at 03:01:36PM -0700, Alexei Starovoitov wrote:
> On Wed, Jun 7, 2023 at 2:04 PM Krister Johansen <kjlx@templeofstupid.com> wrote:
> > Cc: stable@vger.kernel.org
> > Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
> > Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
> > ---
> >  kernel/bpf/verifier.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 5871aa78d01a..d6939db9fbf9 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -17242,6 +17242,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> >  	prog->jited = 1;
> >  	prog->bpf_func = func[0]->bpf_func;
> >  	prog->jited_len = func[0]->jited_len;
> > +	prog->aux->extable = func[0]->aux->extable;
>
> Why not do this hunk and also what I suggested earlier: start from func=1? That will address the double ksym insertion that Yonghong mentioned.
Sure thing. Yonghong and you have convinced me.
I'll send out a v3 with all changes requested so far.
-K
linux-kselftest-mirror@lists.linaro.org