Hello:
This patch was applied to bpf/bpf-next.git (master) by Martin KaFai Lau <martin.lau@kernel.org>:
On Wed, 7 Feb 2024 13:26:17 +0100 you wrote:
In various performance profiles of kernels with BPF programs attached, bpf_local_storage_lookup() appears as a significant portion of CPU cycles spent. To enable the compiler to generate more optimal code, turn bpf_local_storage_lookup() into a static inline function, where only the cache insertion code path is outlined.
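To illustrate the shape of this change, here is a minimal userspace C sketch of the pattern (the names storage_lookup and __storage_insert_cache, and the simplified cache/list layout, are illustrative stand-ins, not the kernel's actual definitions): the lookup is defined static inline in a header so callers can fully inline the hot path, and only the cache insertion is declared as a separate out-of-line function.

#include <pthread.h>

struct map;                        /* stand-in for bpf_local_storage_map */

struct elem {
	struct map  *smap;         /* map this element belongs to */
	struct elem *next;         /* singly linked list of elements */
};

struct storage {
	struct elem *cache;        /* last element found for this map */
	struct elem *list;         /* full list, scanned on cache miss */
	pthread_spinlock_t lock;   /* serializes cache updates */
};

/* Out-of-line cold path: takes the lock and publishes the cache entry. */
void __storage_insert_cache(struct storage *s, struct elem *e);

/* Header-defined lookup: callers inline the common (cache-hit) path down
 * to a few loads and compares; only a miss calls out of line, and that
 * call disappears entirely when cacheit_lockit is a constant false. */
static inline struct elem *storage_lookup(struct storage *s,
					  struct map *smap,
					  int cacheit_lockit)
{
	struct elem *e = s->cache;

	if (e && e->smap == smap)
		return e;                      /* fast path: no call at all */

	for (e = s->list; e; e = e->next)      /* slow path: linear scan */
		if (e->smap == smap)
			break;

	if (e && cacheit_lockit)
		__storage_insert_cache(s, e);  /* outlined cold path */
	return e;
}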
Notably, outlining cache insertion helps avoid bloating callers by duplicating the setup of calls to raw_spin_lock_irqsave()/raw_spin_unlock_irqrestore() (on architectures which do not inline spin_lock/unlock, such as x86), which would cause the compiler to produce worse code by deciding to outline otherwise inlinable functions. The call overhead is neutral, because we make 2 calls either way: either we call raw_spin_lock_irqsave() and raw_spin_unlock_irqrestore() directly; or we call __bpf_local_storage_insert_cache(), which calls raw_spin_lock_irqsave(), followed by a tail-call to raw_spin_unlock_irqrestore(), on which the compiler can perform TCO and (in optimized uninstrumented builds) turn it into a plain jump. The call to __bpf_local_storage_insert_cache() can be elided entirely if cacheit_lockit is a false constant expression.
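A self-contained sketch of the outlined helper, repeating the same illustrative toy types (pthread spinlocks stand in here for the kernel's raw_spin_lock_irqsave()/raw_spin_unlock_irqrestore() pair), shows where the tail call sits:

#include <pthread.h>

struct map;
struct elem { struct map *smap; struct elem *next; };
struct storage {
	struct elem *cache;
	struct elem *list;
	pthread_spinlock_t lock;
};

/* Both lock calls live here, so inlined storage_lookup() callers are not
 * bloated with their argument setup.  The unlock is the final statement,
 * i.e. in tail position: an optimizing compiler can apply TCO and emit it
 * as a plain jump, so the miss path still makes 2 calls total, the same
 * as open-coding lock + unlock at every call site. */
void __storage_insert_cache(struct storage *s, struct elem *e)
{
	pthread_spin_lock(&s->lock);
	s->cache = e;                   /* publish the hit for next lookup */
	pthread_spin_unlock(&s->lock);  /* tail call -> jump under TCO */
}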
[...]
Here is the summary with links:
  - [bpf-next,v2] bpf: Allow compiler to inline most of bpf_local_storage_lookup()
    https://git.kernel.org/bpf/bpf-next/c/68bc61c26cac
You are awesome, thank you!