Hello,
kernel test robot noticed an 11.5% improvement of lmbench3.PIPE.latency.us on:
commit: db86f55bf81a3a297be05ee8775ae9a8c6e3a599 ("cpuidle: governors: menu: Select polling state in some more cases")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: lmbench3
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

	test_memory_size: 50%
	nr_threads: 20%
	mode: development
	test: PIPE
	cpufreq_governor: performance
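For readers unfamiliar with the workload: lmbench3's PIPE test reports the round-trip latency of a small token bounced between two processes over a pair of pipes, so the result is dominated by scheduler wakeups and cpuidle entry/exit on the reader side. A minimal sketch of that ping-pong pattern (illustrative only; the iteration count and timing code below are ours, not lmbench's):

/*
 * Minimal pipe ping-pong sketch (illustrative, not lmbench itself):
 * two processes bounce a one-byte token across a pair of pipes, so
 * each iteration forces two voluntary context switches -- the path
 * whose latency the PIPE test reports.
 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

#define ITERATIONS 100000

int main(void)
{
	int p1[2], p2[2];	/* parent->child and child->parent pipes */
	char token = 'x';

	if (pipe(p1) || pipe(p2)) {
		perror("pipe");
		return 1;
	}

	if (fork() == 0) {	/* child: echo the token back until EOF */
		while (read(p1[0], &token, 1) == 1)
			if (write(p2[1], &token, 1) != 1)
				break;
		_exit(0);
	}

	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < ITERATIONS; i++) {
		if (write(p1[1], &token, 1) != 1 ||
		    read(p2[0], &token, 1) != 1) {
			perror("ping-pong");
			return 1;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double us = (t1.tv_sec - t0.tv_sec) * 1e6 +
		    (t1.tv_nsec - t0.tv_nsec) / 1e3;
	printf("round-trip latency: %.2f us\n", us / ITERATIONS);

	close(p1[1]);		/* let the child see EOF and exit */
	wait(NULL);
	return 0;
}

Each iteration forces two voluntary context switches, which is why lmbench3.time.voluntary_context_switches moves together with the latency change in the table below.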
In addition to that, the commit also has a significant impact on the following tests:
+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 13.4% improvement                              |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                |
|                  | test=context_switch1                                                                        |
+------------------+---------------------------------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20251107/202511071439.d081322d-lkp@i...
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
  gcc-14/performance/x86_64-rhel-9.4/development/20%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/PIPE/50%/lmbench3
commit:
  v6.18-rc3
  db86f55bf8 ("cpuidle: governors: menu: Select polling state in some more cases")
       v6.18-rc3  db86f55bf81a3a297be05ee8775
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
 2.984e+08 ±  3%     +13.0%  3.373e+08 ±  2%  cpuidle..usage
   3870548 ±  3%      +8.9%    4215418 ±  3%  vmstat.system.cs
      5.11           -11.5%       4.52        lmbench3.PIPE.latency.us
 2.949e+08 ±  3%     +13.0%  3.334e+08 ±  2%  lmbench3.time.voluntary_context_switches
   1474808 ±  2%     +13.7%    1676175 ±  2%  sched_debug.cpu.nr_switches.avg
    908098 ±  4%     +11.5%    1012241 ±  5%  sched_debug.cpu.nr_switches.stddev
     35.76 ±  6%     +70.7%      61.02 ± 36%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
    438.60 ±  6%     -42.6%     251.80 ± 28%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
     35.75 ±  6%     +70.7%      61.01 ± 36%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
 6.834e+09 ±  2%      +8.8%  7.438e+09 ±  2%  perf-stat.i.branch-instructions
   4003246 ±  3%      +9.0%    4365130 ±  4%  perf-stat.i.context-switches
     14211 ±  7%     +30.7%      18567 ±  6%  perf-stat.i.cycles-between-cache-misses
 3.305e+10            +7.8%  3.563e+10 ±  2%  perf-stat.i.instructions
     17.81 ±  3%      +9.1%      19.42 ±  4%  perf-stat.i.metric.K/sec
 6.738e+09 ±  2%      +8.8%  7.328e+09 ±  2%  perf-stat.ps.branch-instructions
   3917194 ±  3%      +9.0%    4267905 ±  4%  perf-stat.ps.context-switches
 3.257e+10            +7.8%   3.51e+10 ±  2%  perf-stat.ps.instructions
 4.907e+12 ±  5%     +11.7%  5.481e+12 ±  3%  perf-stat.total.instructions
***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-14/performance/x86_64-rhel-9.4/debian-13-x86_64-20250902.cgz/lkp-spr-2sp4/context_switch1/will-it-scale
commit:
  v6.18-rc3
  db86f55bf8 ("cpuidle: governors: menu: Select polling state in some more cases")
       v6.18-rc3  db86f55bf81a3a297be05ee8775
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
   8360618 ± 14%     +37.6%   11502592 ±  9%  meminfo.DirectMap2M
      0.52 ±  4%      +0.2        0.68 ±  2%  mpstat.cpu.all.irq%
      0.07 ±  2%      +0.0        0.08 ±  2%  mpstat.cpu.all.soft%
  24410019           +11.1%   27123411        sched_debug.cpu.nr_switches.avg
  56464105 ±  3%     +17.9%   66574674 ±  5%  sched_debug.cpu.nr_switches.max
    473411            +2.6%     485915        proc-vmstat.nr_active_anon
   1205948            +1.0%    1218407        proc-vmstat.nr_file_pages
    292446            +4.3%     304940        proc-vmstat.nr_shmem
    473411            +2.6%     485915        proc-vmstat.nr_zone_active_anon
      4.03 ±  3%      -2.5        1.49 ± 21%  turbostat.C1%
      4.83 ±  9%      -1.0        3.86 ± 13%  turbostat.C1E%
 1.087e+08 ±  3%     +59.1%  1.729e+08 ±  3%  turbostat.IRQ
      0.03 ± 13%      +2.0        2.03        turbostat.POLL%
 4.745e+10            +8.5%  5.147e+10        perf-stat.i.branch-instructions
      0.94            -0.0        0.90 ±  3%  perf-stat.i.branch-miss-rate%
 2.567e+08            +9.1%    2.8e+08        perf-stat.i.branch-misses
  48551481            +8.1%   52493759        perf-stat.i.context-switches
      2.76            -4.8%       2.63 ±  2%  perf-stat.i.cpi
    593.92            +5.6%     627.34        perf-stat.i.cpu-migrations
 2.367e+11            +8.2%  2.561e+11        perf-stat.i.instructions
      0.62            +9.8%       0.68        perf-stat.i.ipc
    216.75            +8.1%     234.36        perf-stat.i.metric.K/sec
      1.85            -7.0%       1.73        perf-stat.overall.cpi
      0.54            +7.5%       0.58        perf-stat.overall.ipc
    204618            +2.0%     208750        perf-stat.overall.path-length
 4.674e+10            +8.4%  5.066e+10        perf-stat.ps.branch-instructions
 2.532e+08            +9.0%   2.76e+08        perf-stat.ps.branch-misses
  47800355            +8.1%   51652366        perf-stat.ps.context-switches
    591.27            +5.5%     623.50        perf-stat.ps.cpu-migrations
 2.332e+11            +8.1%  2.521e+11        perf-stat.ps.instructions
 7.417e+13            +8.1%  8.019e+13        perf-stat.total.instructions
    534510 ±  9%     +24.8%     666973 ±  8%  will-it-scale.1.linear
    471854 ±  2%     +26.9%     598959 ± 13%  will-it-scale.1.processes
    534510 ±  9%     +24.8%     666973 ±  8%  will-it-scale.1.threads
  59865194 ±  9%     +24.8%   74700994 ±  8%  will-it-scale.112.linear
  52446572 ±  3%     +17.2%   61467049        will-it-scale.112.processes
     52.12 ±  2%     -11.2%      46.28        will-it-scale.112.processes_idle
  89797792 ±  9%     +24.8%  1.121e+08 ±  8%  will-it-scale.168.linear
  90159107            +5.3%   94951409        will-it-scale.168.processes
     23.53            -8.4%      21.56        will-it-scale.168.processes_idle
 1.197e+08 ±  9%     +24.8%  1.494e+08 ±  8%  will-it-scale.224.linear
  29932597 ±  9%     +24.8%   37350497 ±  8%  will-it-scale.56.linear
  24228097 ±  2%     +22.6%   29712218 ±  2%  will-it-scale.56.processes
     80.80            -3.6%      77.89        will-it-scale.56.processes_idle
    495994           +13.5%     562996 ±  2%  will-it-scale.per_process_ops
    211008 ±  5%     +13.4%     239359 ±  4%  will-it-scale.per_thread_ops
      5748            +1.7%       5848        will-it-scale.time.percent_of_cpu_this_job_got
     16823            +1.5%      17082        will-it-scale.time.system_time
      1410            +4.2%       1469        will-it-scale.time.user_time
 3.625e+08            +6.0%  3.842e+08        will-it-scale.workload
      6.75 ±  9%      -3.4        3.33 ±  4%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      8.01 ±  8%      -1.7        6.26 ±  6%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      7.89 ±  8%      -1.7        6.14 ±  6%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     10.30 ±  6%      -1.7        8.60 ±  4%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      4.36 ±  2%      -0.4        3.99 ±  5%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.59 ±  2%      -0.4        3.23 ±  5%  perf-profile.calltrace.cycles-pp.anon_pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.91 ±  2%      -0.4        3.55 ±  5%  perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.75 ±  2%      -0.4        2.39 ±  6%  perf-profile.calltrace.cycles-pp.__wake_up_sync_key.anon_pipe_write.vfs_write.ksys_write.do_syscall_64
      2.50 ±  2%      -0.4        2.14 ±  6%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.anon_pipe_write.vfs_write
      2.44 ±  2%      -0.4        2.09 ±  6%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.anon_pipe_write
      2.56 ±  2%      -0.4        2.21 ±  6%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_sync_key.anon_pipe_write.vfs_write.ksys_write
      1.73 ±  2%      -0.3        1.39 ±  8%  perf-profile.calltrace.cycles-pp.ttwu_queue_wakelist.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
      1.48 ±  2%      -0.3        1.14 ± 10%  perf-profile.calltrace.cycles-pp.__smp_call_single_queue.ttwu_queue_wakelist.try_to_wake_up.autoremove_wake_function.__wake_up_common
      1.35 ±  2%      -0.3        1.02 ± 11%  perf-profile.calltrace.cycles-pp.call_function_single_prep_ipi.__smp_call_single_queue.ttwu_queue_wakelist.try_to_wake_up.autoremove_wake_function
     61.64            +1.2       62.81        perf-profile.calltrace.cycles-pp.__schedule.schedule.anon_pipe_read.vfs_read.ksys_read
     62.24            +1.2       63.45        perf-profile.calltrace.cycles-pp.schedule.anon_pipe_read.vfs_read.ksys_read.do_syscall_64
      0.00            +1.7        1.72 ± 23%  perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      6.82 ±  9%      -3.5        3.36 ±  4%  perf-profile.children.cycles-pp.intel_idle
      8.10 ±  8%      -1.8        6.34 ±  6%  perf-profile.children.cycles-pp.cpuidle_enter
      8.04 ±  8%      -1.8        6.28 ±  6%  perf-profile.children.cycles-pp.cpuidle_enter_state
     10.45 ±  6%      -1.7        8.72 ±  4%  perf-profile.children.cycles-pp.cpuidle_idle_call
      4.41 ±  2%      -0.4        4.04 ±  5%  perf-profile.children.cycles-pp.ksys_write
      2.76 ±  2%      -0.4        2.40 ±  5%  perf-profile.children.cycles-pp.__wake_up_sync_key
      3.63 ±  2%      -0.4        3.27 ±  5%  perf-profile.children.cycles-pp.anon_pipe_write
      2.51 ±  2%      -0.4        2.15 ±  6%  perf-profile.children.cycles-pp.autoremove_wake_function
      2.47 ±  2%      -0.4        2.11 ±  6%  perf-profile.children.cycles-pp.try_to_wake_up
      3.94 ±  2%      -0.4        3.58 ±  5%  perf-profile.children.cycles-pp.vfs_write
      2.57 ±  2%      -0.4        2.22 ±  6%  perf-profile.children.cycles-pp.__wake_up_common
      1.74 ±  2%      -0.3        1.40 ±  8%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
      1.49 ±  2%      -0.3        1.15 ± 10%  perf-profile.children.cycles-pp.__smp_call_single_queue
      1.36 ±  3%      -0.3        1.03 ± 11%  perf-profile.children.cycles-pp.call_function_single_prep_ipi
      0.22 ±  2%      +0.0        0.25 ±  8%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.23 ±  4%      +0.0        0.28 ±  3%  perf-profile.children.cycles-pp.local_clock_noinstr
     62.26            +1.2       63.47        perf-profile.children.cycles-pp.schedule
     66.06            +1.4       67.43        perf-profile.children.cycles-pp.__schedule
      0.00            +1.8        1.75 ± 23%  perf-profile.children.cycles-pp.poll_idle
      6.82 ±  9%      -3.5        3.36 ±  4%  perf-profile.self.cycles-pp.intel_idle
      1.35 ±  3%      -0.3        1.02 ± 11%  perf-profile.self.cycles-pp.call_function_single_prep_ipi
      0.26 ±  4%      -0.1        0.16 ±  4%  perf-profile.self.cycles-pp.flush_smp_call_function_queue
      0.24 ±  3%      -0.0        0.20 ±  3%  perf-profile.self.cycles-pp.set_next_entity
      0.05            +0.0        0.06        perf-profile.self.cycles-pp.local_clock_noinstr
      0.00            +1.7        1.66 ± 23%  perf-profile.self.cycles-pp.poll_idle
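The profile shift above (intel_idle self cycles dropping from ~6.8% to ~3.4% while poll_idle appears and turbostat.POLL% rises from 0.03% to about 2%) is consistent with the commit subject: for very short expected idle periods the menu governor now selects the polling state more often instead of entering a hardware C-state, which keeps C-state exit latency off the pipe wakeup path. A simplified, hypothetical sketch of that kind of selection logic (not the actual kernel menu_select() code and not the patch under test; state names and thresholds below are illustrative only):

/*
 * Hypothetical sketch of a "prefer polling for very short idles" decision,
 * loosely modeled on cpuidle governor behavior.  NOT the kernel code.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct idle_state {
	const char *name;
	uint64_t exit_latency_ns;	/* cost of waking from this state */
	uint64_t target_residency_ns;	/* minimum stay for it to pay off */
	bool polling;			/* spins instead of halting the CPU */
};

/* Pick the deepest state whose target residency fits the predicted idle
 * time and whose exit latency satisfies the latency requirement; if no
 * real C-state fits, stay in the polling state and skip the entry/exit
 * cost entirely. */
static int select_state(const struct idle_state *states, int nr,
			uint64_t predicted_idle_ns, uint64_t latency_req_ns)
{
	int best = 0;	/* state 0 is assumed to be the polling state */

	for (int i = 1; i < nr; i++) {
		if (states[i].target_residency_ns > predicted_idle_ns)
			break;	/* too deep: would not amortize entry cost */
		if (states[i].exit_latency_ns > latency_req_ns)
			break;	/* violates the latency requirement */
		best = i;
	}
	return best;
}

int main(void)
{
	/* Illustrative state table; the numbers are made up. */
	const struct idle_state table[] = {
		{ "POLL", 0,      0,      true  },
		{ "C1",   2000,   2000,   false },
		{ "C1E",  10000,  20000,  false },
		{ "C6",   200000, 600000, false },
	};
	/* A 1.5us predicted idle is shorter than C1's target residency,
	 * so the polling state wins. */
	int idx = select_state(table, 4, 1500, UINT64_MAX);
	printf("selected: %s\n", table[idx].name);
	return 0;
}

With a predicted idle time below the shallowest C-state's target residency, the sketch falls back to the polling state, which matches the extra poll_idle samples and the reduced C1/C1E residency reported above.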
Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.