On 2024/3/6 3:38, Mina Almasry wrote:
On Tue, Mar 5, 2024 at 4:54 AM Yunsheng Lin linyunsheng@huawei.com wrote:
On 2024/3/5 10:01, Mina Almasry wrote:
...
Perf - page-pool benchmark:
bench_page_pool_simple.ko tests with and without these changes: https://pastebin.com/raw/ncHDwAbn
AFAIK the number that really matters in the perf tests is the 'tasklet_page_pool01_fast_path Per elem'. This one measures at about 8 cycles without the changes but there is some 1 cycle noise in some results.
With the patches this regresses to 9 cycles with the changes but there is 1 cycle noise occasionally running this test repeatedly.
Lastly I tried disable the static_branch_unlikely() in netmem_is_net_iov() check. To my surprise disabling the static_branch_unlikely() check reduces the fast path back to 8 cycles, but the 1 cycle noise remains.
The last sentence seems to be suggesting the above 1 ns regresses is caused by the static_branch_unlikely() checking?
Note it's not a 1ns regression, it's looks like maybe a 1 cycle regression (slightly less than 1ns if I'm reading the output of the test correctly):
# clean net-next time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc) 2.993 ns (step:0)
# with patches time_bench: Type:tasklet_page_pool01_fast_path Per elem: 9 cycles(tsc) 3.679 ns (step:0)
# with patches and with diff that disables static branching:
time_bench: Type:tasklet_page_pool01_fast_path Per elem: 8 cycles(tsc) 3.248 ns (step:0)
I do see noise in the test results between run and run, and any regression (if any) is slightly obfuscated by the noise, so it's a bit hard to make confident statements. So far it looks like a ~0.25ns regression without static branch and about ~0.65ns with static branch.
Honestly when I saw all 3 results were within some noise I did not investigate more, but if this looks concerning to you I can dig further. I likely need to gather a few test runs to filter out the noise and maybe investigate the assembly my compiler is generating to maybe narrow down what changes there.
Yes, that is confusing enough that need more investigation.