On 03/09/16 00:27, markivx@codeaurora.org wrote:
This patch series implements an alternative window assisted load tracking mechanism in lieu of PELT based cpu utilization tracking. Testing has shown that a window based non-decaying metric such as WALT guiding cpu frequency and task placement decisions can improve performance/power especially when running workloads more commonly found on mobile devices. The aim of this series is to incorporate WALT accounting into the scheduler and feed WALT statistics to schedutil in order to guide cpu frequency selection. The implementation is detailed in the commit text of Patch 1. The eventual goal is to also guide placement decisions based on WALT statistics.
WALT has existed in out-of-tree kernels for ARM/ARM64 commercialized devices for a few years. This is an effort to bring WALT to mainline as well as to test on multiple architectures and with varied workloads.
This RFC version is mainly to preview what the code will look like on mainline. Future RFC revisions will include a theoretical discussion and benchmark results.
Tested on an Intel x86_64 machine (on top of 4.7-rc6). (Benchmark results will be sent out separately and as part of this message in the next RFC version).
Patch 1: Adds WALT tracking to the scheduler
Patches 2-3: Temporary patches to bring in EAS/sched-freq like capacity table and to use Intel PMC counters for more accurate frequency invariant load tracking on X86. Included for completeness but not meant for merging.
include/linux/sched.h | 35 ++++++++++ include/linux/sched/sysctl.h | 2 + include/trace/events/sched.h | 76 +++++++++++++++++++++ init/Kconfig | 9 +++ kernel/sched/Makefile | 1 + kernel/sched/core.c | 29 ++++++++- kernel/sched/cpufreq_schedutil.c | 44 ++++++++++++- kernel/sched/cputime.c | 11 +++- kernel/sched/debug.c | 10 +++ kernel/sched/fair.c | 7 +- kernel/sched/sched.h | 13 ++++ kernel/sched/walt.c | 580 ++++++++++++++++++++++++++++++++++ kernel/sched/walt.h | 75 +++++++++++++++++++++ kernel/sysctl.c | 18 +++++ 14 files changed, 904 insertions(+), 6 deletions(-)
I caught a WALT related hard lockup on a v4.7 kernel with only patch 1 on top. Fairly easy to reproduce by watching a video in firefox browser on Ubuntu 16.04.
$ addr2line -e vmlinux ffffffff810d835e kernel/sched/sched.h:1542
$ addr2line -e vmlinux ffffffff810d29b0 kernel/sched/sched.h:1538
1531 static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest) 1532 __releases(this_rq->lock) 1533 __acquires(busiest->lock) 1534 __acquires(this_rq->lock) 1535 { 1536 int ret = 0; 1537 1538 if (unlikely(!raw_spin_trylock(&busiest->lock))) { 1539 if (busiest < this_rq) { 1540 raw_spin_unlock(&this_rq->lock); 1541 raw_spin_lock(&busiest->lock); 1542 raw_spin_lock_nested(&this_rq->lock, 1543 SINGLE_DEPTH_NESTING); 1544 ret = 1; 1545 } else 1546 raw_spin_lock_nested(&busiest->lock, 1547 SINGLE_DEPTH_NESTING); 1548 } 1549 return ret; 1550 }
[ 118.795603] ============================================= [ 118.795606] [ INFO: possible recursive locking detected ] [ 118.795609] 4.7.0-walt-v4 #3 Not tainted [ 118.795612] --------------------------------------------- [ 118.795615] rtkit-daemon/3133 is trying to acquire lock: [ 118.795619] (&rq->lock){-.-.-.}, at: [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300 [ 118.795635] [ 118.795635] but task is already holding lock: [ 118.795638] (&rq->lock){-.-.-.}, at: [<ffffffff810d29b0>] push_rt_task.part.39+0xb0/0x2a0 [ 118.795650] [ 118.795650] other info that might help us debug this: [ 118.795653] Possible unsafe locking scenario: [ 118.795653] [ 118.795656] CPU0 [ 118.795659] ---- [ 118.795661] lock(&rq->lock); [ 118.795667] lock(&rq->lock); [ 118.795673] [ 118.795673] *** DEADLOCK *** [ 118.795673] [ 118.795676] May be due to missing lock nesting notation [ 118.795676] [ 118.795680] 1 lock held by rtkit-daemon/3133: [ 118.795682] #0: (&rq->lock){-.-.-.}, at: [<ffffffff810d29b0>] push_rt_task.part.39+0xb0/0x2a0 [ 118.795692] [ 118.795692] stack backtrace: [ 118.795697] CPU: 1 PID: 3133 Comm: rtkit-daemon Not tainted 4.7.0-walt-v4 #3 [ 118.795700] Hardware name: LENOVO 2537Z5F/2537Z5F, BIOS 6IET74WW (1.34 ) 10/25/2010 [ 118.795703] 0000000000000000 ffff8800ad7e77a8 ffffffff8143001c ffffffff829e8fc0 [ 118.795711] ffffffff829e8fc0 ffff8800ad7e7848 ffffffff810e5eab ffff880000000000 [ 118.795722] 000000000003e01f ffffffff8235f800 ffff8800af3ccd40 000000000000032f [ 118.795729] Call Trace: [ 118.795735] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 [ 118.795736] in_atomic(): 1, irqs_disabled(): 1, pid: 3133, name: rtkit-daemon [ 118.795736] INFO: lockdep is turned off. [ 118.795737] irq event stamp: 330 [ 118.795741] hardirqs last enabled at (329): [<ffffffff818b675c>] _raw_spin_unlock_irq+0x2c/0x40 [ 118.795743] hardirqs last disabled at (330): [<ffffffff818b6eeb>] _raw_spin_lock_irqsave+0x2b/0x90 [ 118.795747] softirqs last enabled at (0): [<ffffffff810827b1>] copy_process.part.30+0x5c1/0x1e60 [ 118.795749] softirqs last disabled at (0): [< (null)>] (null) [ 118.795750] CPU: 1 PID: 3133 Comm: rtkit-daemon Not tainted 4.7.0-walt-v4 #3 [ 118.795751] Hardware name: LENOVO 2537Z5F/2537Z5F, BIOS 6IET74WW (1.34 ) 10/25/2010 [ 118.795753] 0000000000000001 ffff8800ad7e7390 ffffffff8143001c ffff8800af3ccd40 [ 118.795754] ffffffff81ca0267 ffff8800ad7e73b8 ffffffff810b3490 ffffffff81ca0267 [ 118.795756] 000000000000006e 0000000000000000 ffff8800ad7e73e0 ffffffff810b3599 [ 118.795756] Call Trace: [ 118.795763] [<ffffffff8143001c>] dump_stack+0x85/0xc9 [ 118.795766] [<ffffffff810b3490>] ___might_sleep+0x180/0x240 [ 118.795768] [<ffffffff810b3599>] __might_sleep+0x49/0x80 [ 118.795771] [<ffffffff810fc838>] synchronize_irq+0x38/0xa0 [ 118.795772] [<ffffffff810fbdfe>] ? __irq_put_desc_unlock+0x1e/0x40 [ 118.795774] [<ffffffff810fcae9>] ? __disable_irq_nosync+0x49/0x70 [ 118.795775] [<ffffffff810fcb3c>] disable_irq+0x1c/0x30 [ 118.795787] [<ffffffffc0172a02>] e1000_netpoll+0xf2/0x120 [e1000e] [ 118.795791] [<ffffffff817a3518>] netpoll_poll_dev+0x78/0x2c0 [ 118.795793] [<ffffffff817a3900>] netpoll_send_skb_on_dev+0x1a0/0x290 [ 118.795795] [<ffffffff817a3ccf>] netpoll_send_udp+0x2df/0x470 [ 118.795798] [<ffffffffc012ab32>] write_msg+0xb2/0xf0 [netconsole] [ 118.795800] [<ffffffff810f9489>] call_console_drivers.constprop.23+0x149/0x1e0 [ 118.795802] [<ffffffff810fa334>] console_unlock+0x4e4/0x5b0 [ 118.795803] [<ffffffff810fa7ae>] vprintk_emit+0x3ae/0x5d0 [ 118.795805] [<ffffffff810fab29>] vprintk_default+0x29/0x40 [ 118.795808] [<ffffffff811b7be2>] printk+0x4d/0x4f [ 118.795812] [<ffffffff810372c2>] show_trace_log_lvl+0x32/0x60 [ 118.795814] [<ffffffff8103681f>] show_stack_log_lvl+0xff/0x180 [ 118.795816] [<ffffffff81037335>] show_stack+0x25/0x50 [ 118.795818] [<ffffffff8143001c>] dump_stack+0x85/0xc9 [ 118.795821] [<ffffffff810e5eab>] __lock_acquire+0x193b/0x1940 [ 118.795823] [<ffffffff810deab4>] ? cpuacct_charge+0xd4/0x1d0 [ 118.795825] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.795826] [<ffffffff810d1dda>] ? update_curr_rt+0x15a/0x300 [ 118.795828] [<ffffffff810e6533>] lock_acquire+0xd3/0x220 [ 118.795830] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 [ 118.795831] [<ffffffff818b638d>] _raw_spin_lock+0x3d/0x80 [ 118.795833] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 [ 118.795834] [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300 [ 118.795836] [<ffffffff810b783c>] set_task_cpu+0xac/0x2e0 [ 118.795837] [<ffffffff810d2a53>] push_rt_task.part.39+0x153/0x2a0 [ 118.795839] [<ffffffff810d2cb7>] push_rt_tasks+0x17/0x30 [ 118.795841] [<ffffffff811b6d3b>] __balance_callback+0x45/0x5c [ 118.795844] [<ffffffff818b0d96>] __schedule+0xaf6/0xbb0 [ 118.795846] [<ffffffff818b0e8c>] schedule+0x3c/0x90 [ 118.795847] [<ffffffff818b6053>] schedule_hrtimeout_range_clock+0xe3/0x140 [ 118.795850] [<ffffffff811125c0>] ? hrtimer_init+0x230/0x230 [ 118.795852] [<ffffffff818b6047>] ? schedule_hrtimeout_range_clock+0xd7/0x140 [ 118.795853] [<ffffffff818b60c3>] schedule_hrtimeout_range+0x13/0x20 [ 118.795858] [<ffffffff81261604>] poll_schedule_timeout+0x54/0x80 [ 118.795859] [<ffffffff81262e67>] do_sys_poll+0x3a7/0x510 [ 118.795861] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.795864] [<ffffffff8129ac30>] ? ep_poll_callback+0x120/0x360 [ 118.795866] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.795867] [<ffffffff810d60c0>] ? __wake_up_sync_key+0x50/0x60 [ 118.795869] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 [ 118.795871] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 [ 118.795873] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.795875] [<ffffffff811eeced>] ? __might_fault+0x4d/0xa0 [ 118.795877] [<ffffffff810e497d>] ? __lock_acquire+0x40d/0x1940 [ 118.795879] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.795880] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 [ 118.795883] [<ffffffff8111a3a4>] ? ktime_get_ts64+0x84/0x180 [ 118.795885] [<ffffffff810e424d>] ? trace_hardirqs_on+0xd/0x10 [ 118.795886] [<ffffffff8111a3d6>] ? ktime_get_ts64+0xb6/0x180 [ 118.795888] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 [ 118.795889] [<ffffffff81263095>] SyS_poll+0x65/0xf0 [ 118.795891] [<ffffffff818b7080>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 118.796149] [<ffffffff8143001c>] dump_stack+0x85/0xc9 [ 118.796153] [<ffffffff810e5eab>] __lock_acquire+0x193b/0x1940 [ 118.796156] [<ffffffff810deab4>] ? cpuacct_charge+0xd4/0x1d0 [ 118.796159] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.796163] [<ffffffff810d1dda>] ? update_curr_rt+0x15a/0x300 [ 118.796166] [<ffffffff810e6533>] lock_acquire+0xd3/0x220 [ 118.796169] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 [ 118.796173] [<ffffffff818b638d>] _raw_spin_lock+0x3d/0x80 [ 118.796176] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 [ 118.796179] [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300 [ 118.796183] [<ffffffff810b783c>] set_task_cpu+0xac/0x2e0 [ 118.796187] [<ffffffff810d2a53>] push_rt_task.part.39+0x153/0x2a0 [ 118.796190] [<ffffffff810d2cb7>] push_rt_tasks+0x17/0x30 [ 118.796194] [<ffffffff811b6d3b>] __balance_callback+0x45/0x5c [ 118.796198] [<ffffffff818b0d96>] __schedule+0xaf6/0xbb0 [ 118.796201] [<ffffffff818b0e8c>] schedule+0x3c/0x90 [ 118.796204] [<ffffffff818b6053>] schedule_hrtimeout_range_clock+0xe3/0x140 [ 118.796207] [<ffffffff811125c0>] ? hrtimer_init+0x230/0x230 [ 118.796211] [<ffffffff818b6047>] ? schedule_hrtimeout_range_clock+0xd7/0x140 [ 118.796215] [<ffffffff818b60c3>] schedule_hrtimeout_range+0x13/0x20 [ 118.796218] [<ffffffff81261604>] poll_schedule_timeout+0x54/0x80 [ 118.796221] [<ffffffff81262e67>] do_sys_poll+0x3a7/0x510 [ 118.796225] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.796228] [<ffffffff8129ac30>] ? ep_poll_callback+0x120/0x360 [ 118.796232] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.796235] [<ffffffff810d60c0>] ? __wake_up_sync_key+0x50/0x60 [ 118.796239] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 [ 118.796242] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 [ 118.796246] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.796249] [<ffffffff811eeced>] ? __might_fault+0x4d/0xa0 [ 118.796253] [<ffffffff810e497d>] ? __lock_acquire+0x40d/0x1940 [ 118.796257] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 [ 118.796260] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 [ 118.796264] [<ffffffff8111a3a4>] ? ktime_get_ts64+0x84/0x180 [ 118.796268] [<ffffffff810e424d>] ? trace_hardirqs_on+0xd/0x10 [ 118.796271] [<ffffffff8111a3d6>] ? ktime_get_ts64+0xb6/0x180 [ 118.796275] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 [ 118.796278] [<ffffffff81263095>] SyS_poll+0x65/0xf0 [ 118.796281] [<ffffffff818b7080>] entry_SYSCALL_64_fastpath+0x23/0xc1 [ 128.972478] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0dModules linked in: intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel arc4 aesni_intel iwldvm aes_x86_64 lrw gf128mul mac80211 glue_helper ablk_helper cryptd joydev iwlwifi snd_hda_codec_hdmi serio_raw snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel intel_ips snd_hda_codec cfg80211 snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi thinkpad_acpi snd_seq lpc_ich snd_seq_device mei_me snd_timer nvram mei snd soundcore mac_hid shpchp netconsole configfs parport_pc ppdev lp parport autofs4 hid_generic nouveau mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea firewire_ohci sysfillrect usbhid sysimgblt ahci fb_sys_fops e1000e psmouse hid libahci sdhci_pci firewire_core crc_itu_t sdhci drm ptp pps_core video wmi [ 128.972479] irq event stamp: 1026850 [ 128.972480] hardirqs last enabled at (1026849): [<ffffffff811243a6>] tick_nohz_idle_enter+0x46/0x80 [ 128.972481] hardirqs last disabled at (1026850): [<ffffffff810d6a7d>] cpu_startup_entry+0xcd/0x450 [ 128.972481] softirqs last enabled at (1026834): [<ffffffff8108b181>] _local_bh_enable+0x21/0x50 [ 128.972482] softirqs last disabled at (1026833): [<ffffffff8108c2b2>] irq_enter+0x72/0xa0