On Fri, Dec 13, 2024 at 03:02:11PM GMT, Andrii Nakryiko wrote:
On Thu, Dec 12, 2024 at 3:23 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
This commit allows progs to elide a null check on statically known map lookup keys. In other words, if the verifier can statically prove that the lookup will be in-bounds, allow the prog to drop the null check.
This is useful for two reasons:
- Large numbers of nullness checks (especially when they cannot fail) unnecessarily push the prog towards BPF_COMPLEXITY_LIMIT_JMP_SEQ.
- It forms a tighter contract between programmer and verifier.
For (1), bpftrace is starting to make heavier use of percpu scratch maps. As a result, for user scripts with a large number of unrolled loops, we are starting to hit jump complexity verification errors. These percpu lookups cannot fail anyway, as we only use static key values. Eliding the nullness check probably results in less work for the verifier as well.
For (2), percpu scratch maps are often used as a larger stack, as the current stack is limited to 512 bytes. In these situations, it is desirable for the programmer to express: "this lookup should never fail, and if it does, it means I messed up the code". By omitting the null check, the programmer can "ask" the verifier to double check the logic.
Tests also have to be updated in sync with these changes, as the verifier is more efficient with this change. Notably, the iters.c tests had to be changed to use a map type that still requires null checks, as they exercise verifier tracking logic w.r.t. iterators.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
 kernel/bpf/verifier.c                         | 80 ++++++++++++++++++-
 tools/testing/selftests/bpf/progs/iters.c     | 14 ++--
 .../selftests/bpf/progs/map_kptr_fail.c       |  2 +-
 .../selftests/bpf/progs/verifier_map_in_map.c |  2 +-
 .../testing/selftests/bpf/verifier/map_kptr.c |  2 +-
 5 files changed, 87 insertions(+), 13 deletions(-)
Eduard has great points. I've added a few more comments below.
pw-bot: cr
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 58b36cc96bd5..4947ef884a18 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -287,6 +287,7 @@ struct bpf_call_arg_meta {
 	u32 ret_btf_id;
 	u32 subprogno;
 	struct btf_field *kptr_field;
+	s64 const_map_key;
 };
 struct bpf_kfunc_call_arg_meta {
@@ -9163,6 +9164,53 @@ static int check_reg_const_str(struct bpf_verifier_env *env,
 	return 0;
 }
+/* Returns constant key value if possible, else -1 */
+static s64 get_constant_map_key(struct bpf_verifier_env *env,
+				struct bpf_reg_state *key,
+				u32 key_size)
+{
+	struct bpf_func_state *state = func(env, key);
+	struct bpf_reg_state *reg;
+	int zero_size = 0;
+	int stack_off;
+	u8 *stype;
+	int slot;
+	int spi;
+	int i;
+
+	if (!env->bpf_capable)
+		return -1;
+	if (key->type != PTR_TO_STACK)
+		return -1;
+	if (!tnum_is_const(key->var_off))
+		return -1;
+
+	stack_off = key->off + key->var_off.value;
+	slot = -stack_off - 1;
+	spi = slot / BPF_REG_SIZE;
+
+	/* First handle precisely tracked STACK_ZERO, up to BPF_REG_SIZE */
+	stype = state->stack[spi].slot_type;
+	for (i = 0; i < BPF_REG_SIZE && stype[i] == STACK_ZERO; i++)
it's Friday and I'm lazy, but please double-check that this works for both big-endian and little-endian :)
Any tips? Are the existing tests running through s390x hosts in CI sufficient, or should I add some tests written in C (and not BPF assembler)? I can never think about endianness correctly...
with Eduard's suggestion this also becomes interesting when you have 000mmm mix (as one example), because that gives you a small range, and all values might be valid keys for arrays
Can you define what "small range" means? What range is there with 0's? Any pointers would be helpful.
+		zero_size++;
+
+	if (zero_size == key_size)
+		return 0;
+	if (!is_spilled_reg(&state->stack[spi]))
+		/* Not pointer to stack */
!is_spilled_reg and "Not pointer to stack" don't seem to be exactly the same thing?
You're right - the comment is not helpful. I'll make the change to use is_spilled_scalar_reg(), which is probably as clear as it gets.
[..]