Re: [RFC PATCH net-next v6 00/15] Device Memory TCP

6 Mar 2024


      On 2024/3/6 3:38, Mina Almasry wrote:
...
On Tue, Mar 5, 2024 at 4:54 AM Yunsheng Lin linyunsheng@huawei.com wrote:
...
On 2024/3/5 10:01, Mina Almasry wrote:
...
...
Perf - page-pool benchmark:
bench_page_pool_simple.ko tests with and without these changes:
https://pastebin.com/raw/ncHDwAbn
AFAIK the number that really matters in the perf tests is the
'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8
cycles without the changes but there is some 1 cycle noise in some
results.
With the patches this regresses to 9 cycles with the changes but there
is 1 cycle noise occasionally running this test repeatedly.
Lastly I tried disable the static_branch_unlikely() in
netmem_is_net_iov() check. To my surprise disabling the
static_branch_unlikely() check reduces the fast path back to 8 cycles,
but the 1 cycle noise remains.
The last sentence seems to be suggesting the above 1 ns regresses is caused
by the static_branch_unlikely() checking?
Note it's not a 1ns regression, it's looks like maybe a 1 cycle
regression (slightly less than 1ns if I'm reading the output of the
test correctly):
# clean net-next
time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc)
2.993 ns (step:0)
# with patches
time_bench: Type:tasklet_page_pool01_fast_path Per elem: 9 cycles(tsc)
3.679 ns (step:0)
# with patches and with diff that disables static branching:
...
time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc)
3.248 ns (step:0)
I do see noise in the test results between run and run, and any
regression (if any) is slightly obfuscated by the noise, so it's a bit
hard to make confident statements. So far it looks like a ~0.25ns
regression without static branch and about ~0.65ns with static branch.
Honestly when I saw all 3 results were within some noise I did not
investigate more, but if this looks concerning to you I can dig
further. I likely need to gather a few test runs to filter out the
noise and maybe investigate the assembly my compiler is generating to
maybe narrow down what changes there.
Yes, that is confusing enough that need more investigation.
...

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [RFC PATCH net-next v6 00/15] Device Memory TCP

Perf - page-pool benchmark: