- eas-dev - lists.linaro.org

[PATCH RFC] sched/freq: fixing frequency update in schedule tick

by Leo Yan

Neither sched_freq_tick_pelt() nor sched_freq_tick_walt() considers the SchedTune boost margin when setting the CPU frequency. For example, when a task is enqueued onto the rq the boost margin is taken into account, but when a tick is triggered a while later the code goes back to the original CPU utilization value instead of the boosted value.

The second error: the capacity request needs to be converted from a normalized value to a ratio in the range [0..1024], where the ratio is the capacity requirement compared to the CPU's maximum capacity.

This patch fixes both errors. Please note that this patch does not build successfully yet, because some code still needs to be reworked. I am sending it for discussion first; once we reach a conclusion I will generate formal patches.

Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
 kernel/sched/core.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 10f36e2..6f9433e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2947,28 +2947,32 @@ unsigned long sum_capacity_reqs(unsigned long cfs_cap,

 static void sched_freq_tick_pelt(int cpu)
 {
-	unsigned long cpu_utilization = capacity_max;
+	unsigned long cpu_utilization = boosted_cpu_util(cpu);
 	unsigned long capacity_curr = capacity_curr_of(cpu);
 	struct sched_capacity_reqs *scr;
+	unsigned long req_cap;

 	scr = &per_cpu(cpu_sched_capacity_reqs, cpu);
 	if (sum_capacity_reqs(cpu_utilization, scr) < capacity_curr)
 		return;

+	req_cap = cpu_utilization * SCHED_CAPACITY_SCALE / capacity_orig_of(cpu);
+
 	/*
 	 * To make free room for a task that is building up its "real"
 	 * utilization and to harm its performance the least, request
 	 * a jump to a higher OPP as soon as the margin of free capacity
 	 * is impacted (specified by capacity_margin).
 	 */
-	set_cfs_cpu_capacity(cpu, true, cpu_utilization);
+	set_cfs_cpu_capacity(cpu, true, req_cap);
 }

 #ifdef CONFIG_SCHED_WALT
 static void sched_freq_tick_walt(int cpu)
 {
-	unsigned long cpu_utilization = cpu_util(cpu);
+	unsigned long cpu_utilization = boosted_cpu_util(cpu);
 	unsigned long capacity_curr = capacity_curr_of(cpu);
+	unsigned long req_cap;

 	if (walt_disabled || !sysctl_sched_use_walt_cpu_util)
 		return sched_freq_tick_pelt(cpu);
@@ -2983,12 +2987,14 @@ static void sched_freq_tick_walt(int cpu)
 	if (cpu_utilization <= capacity_curr)
 		return;

+	req_cap = cpu_utilization * SCHED_CAPACITY_SCALE / capacity_orig_of(cpu);
+
 	/*
 	 * It is likely that the load is growing so we
 	 * keep the added margin in our request as an
 	 * extra boost.
 	 */
-	set_cfs_cpu_capacity(cpu, true, cpu_utilization);
+	set_cfs_cpu_capacity(cpu, true, req_cap);
 }

 #define _sched_freq_tick(cpu) sched_freq_tick_walt(cpu)
--
1.9.1
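
For illustration, the normalization step described in this patch can be sketched in user space as follows. This is only a sketch under assumptions: SCHED_CAPACITY_SCALE is taken to be 1024 as in the kernel, while toy_req_cap() and the clamp inside it are hypothetical and not part of the patch, and 446 is just an example capacity for a LITTLE CPU.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL	/* same value as the kernel constant */

/*
 * Hypothetical helper: convert an absolute utilization into a request
 * relative to this CPU's original (maximum) capacity, in [0..1024].
 */
static unsigned long toy_req_cap(unsigned long util, unsigned long capacity_orig)
{
	assert(capacity_orig > 0);

	/* Clamp so the ratio never exceeds 1024 (assumption, not in the patch). */
	if (util > capacity_orig)
		util = capacity_orig;

	return util * SCHED_CAPACITY_SCALE / capacity_orig;
}
```

With these example numbers, a utilization of 223 on a CPU whose original capacity is 446 becomes a request of 512, i.e. half of that CPU's maximum capacity.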

9 years, 6 months

1
0
0 0

Re: [Eas-dev] [PATCH] sched: ensure periodic update of cpu usage

by Dietmar Eggemann

Hi Vincent,

as promised in our last 'technical sync-up meeting', here is some feedback on your patch. The version of the patch is from March this year, so a lot has changed in the meantime, but I hope this feedback is still valuable. You might already have addressed some of the issues in your current rebase of the patch.

The overall idea seems to be to piggyback NOHZ_STATS_KICK onto the NOHZ_BALANCE_KICK machinery, so that the back end (SCHED_SOFTIRQ) can distinguish between the need for a nohz stats update and the need for a nohz balance.

-- Dietmar

On 14/03/16 09:55, Vincent Guittot wrote:
> Conflicts:
> 	kernel/sched/fair.c
> ---
>
> Hi Morten,
>
> I have finally been able to fix my connection issue. This patch uses the
> update_blocked_averages mechanism that is present in the ILB to ensure that the
> blocked load will be updated often enough to stay meaningful.
> There are still some parts that should be fixed, like the fixed 5 ticks after
> the next update to trigger an update.
>
> Vincent
>
>
>  kernel/sched/fair.c  | 64 ++++++++++++++++++++++++++++++++++++++++++++++++----
>  kernel/sched/sched.h |  1 +
>  2 files changed, 61 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d2d0df4..a716299 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5108,6 +5108,9 @@ static int get_cpu_usage(int cpu)
>  	return (usage * capacity) >> SCHED_LOAD_SHIFT;
>  }
>
> +static inline bool nohz_stat_kick_needed(int cpu);
> +static void nohz_balancer_kick(bool only_update);
> +
>  /*
>   * select_task_rq_fair: Select target runqueue for the waking task in domains
>   * that have the 'sd_flag' flag set. In practice, this is SD_BALANCE_WAKE,
> @@ -5167,6 +5170,10 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
>  		if (sd_flag & SD_BALANCE_WAKE) /* XXX always ? */
>  			new_cpu = select_idle_sibling(p, new_cpu);
>
> +#ifdef CONFIG_NO_HZ_COMMON
> +		if (nohz_stat_kick_needed(new_cpu))
> +			nohz_balancer_kick(true);
> +#endif

Why do you request the update only in the select_idle_sibling() path?

>  	} else while (sd) {
>  		struct sched_group *group;
>  		int weight;
> @@ -7313,6 +7320,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  	}
>
>  	group = find_busiest_group(&env);
> +
> +	if (test_and_clear_bit(NOHZ_STATS_KICK, nohz_flags(this_cpu))) {
> +		ld_moved = 0;
> +		goto out;
> +	}
> +
>  	if (!group) {
>  		schedstat_inc(sd, lb_nobusyg[idle]);
>  		goto out_balanced;
> @@ -7585,8 +7598,9 @@ static int idle_balance(struct rq *this_rq)
>  	 */
>  	this_rq->idle_stamp = rq_clock(this_rq);
>
> -	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
> -	    !this_rq->rd->overload) {
> +	if (!test_bit(NOHZ_STATS_KICK, nohz_flags(this_cpu)) &&
> +	    (this_rq->avg_idle < sysctl_sched_migration_cost ||
> +	    !this_rq->rd->overload)) {

In case 'NOHZ_STATS_KICK' is set you want to call update_blocked_averages(this_cpu) below, but do you also want to do the actual idle load balancing?

>  		rcu_read_lock();
>  		sd = rcu_dereference_check_sched_domain(this_rq->sd);
>  		if (sd)
> @@ -7639,6 +7653,8 @@ static int idle_balance(struct rq *this_rq)
>
>  	raw_spin_lock(&this_rq->lock);
>
> +	clear_bit(NOHZ_STATS_KICK, nohz_flags(this_cpu));
> +
>  	if (curr_cost > this_rq->max_idle_balance_cost)
>  		this_rq->max_idle_balance_cost = curr_cost;
>
> @@ -7776,6 +7792,9 @@ static inline int find_new_ilb(void)
>  		ilb = cpumask_first_and(sched_domain_span(sd),
>  					nohz.idle_cpus_mask);
>

Didn't compile for me. I can't find an sd in find_new_ilb() in mainline. Did you add it in a previous patch?

> +		if (ilb == smp_processor_id())
> +			ilb = cpumask_next_and(ilb, sched_domain_span(sd),
> +					nohz.idle_cpus_mask);
>  		if (ilb < nr_cpu_ids)
>  			break;
>  	}
> @@ -7793,7 +7812,7 @@ static inline int find_new_ilb(void)
>   * nohz_load_balancer CPU (if there is one) otherwise fallback to any idle
>   * CPU (if there is one).
>   */
> -static void nohz_balancer_kick(void)
> +static void nohz_balancer_kick(bool only_update)
>  {
>  	int ilb_cpu;
>
> @@ -7806,6 +7825,9 @@ static void nohz_balancer_kick(void)
>
>  	if (test_and_set_bit(NOHZ_BALANCE_KICK, nohz_flags(ilb_cpu)))
>  		return;
> +
> +	if(only_update)
> +		set_bit(NOHZ_STATS_KICK, nohz_flags(ilb_cpu));
>  	/*
>  	 * Use smp_send_reschedule() instead of resched_cpu().
>  	 * This way we generate a sched IPI on the target cpu which
> @@ -8000,6 +8022,8 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
>  	}
>  	rcu_read_unlock();
>
> +	/* clear any pending stats update request */
> +	clear_bit(NOHZ_STATS_KICK, nohz_flags(cpu));
>  	/*
>  	 * next_balance will be updated only when there is a need.
>  	 * When the cpu is attached to null domain for ex, it will not be
> @@ -8019,11 +8043,14 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  	int this_cpu = this_rq->cpu;
>  	struct rq *rq;
>  	int balance_cpu;
> +	int update_stats_only = 0;
>
>  	if (idle != CPU_IDLE ||
>  	    !test_bit(NOHZ_BALANCE_KICK, nohz_flags(this_cpu)))
>  		goto end;
>
> +	if (test_bit(NOHZ_STATS_KICK, nohz_flags(this_cpu)))
> +		update_stats_only = 1;
>  	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
>  		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
>  			continue;
> @@ -8043,6 +8070,11 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
>  		 * do the balance.
>  		 */
>  		if (time_after_eq(jiffies, rq->next_balance)) {
> +
> +			/* only stats update is required */
> +			if (update_stats_only)
> +				set_bit(NOHZ_STATS_KICK, nohz_flags(balance_cpu));

Why don't you call update_blocked_averages(balance_cpu) here and skip the rebalance_domains() call in case of update_stats_only = 1 (i.e. in case NOHZ_STATS_KICK was set on this_cpu)? I assume here that when NOHZ_STATS_KICK is set we really only want to do the update and no actual load balancing.

> +
>  			raw_spin_lock_irq(&rq->lock);
>  			update_rq_clock(rq);
>  			update_idle_cpu_load(rq);
> @@ -8137,8 +8169,32 @@ static inline bool nohz_kick_needed(struct rq *rq)
>  	rcu_read_unlock();
>  	return kick;
>  }
> +
> +static inline bool nohz_stat_kick_needed(int cpu)
> +{
> +	unsigned long now = jiffies;

You don't bail here if rq->idle_balance is set like nohz_kick_needed() does?

> +	/*
> +	 * None are in tickless mode and hence no need for NOHZ idle load
> +	 * balancing.
> +	 */
> +	if (likely(!atomic_read(&nohz.nr_cpus)))
> +		return false;
> +
> +	if (time_before(now, nohz.next_balance+5))
> +		return false;
> +
> +	/* ensure that this cpu statistics will be updated */
> +	set_bit(NOHZ_STATS_KICK, nohz_flags(cpu));
> +
> +	return true;
> +}
> +
> +
>  #else
>  static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) { }
> +static inline bool nohz_stat_kick_needed(int cpu) { return false; }
>  #endif
>
>  /*
> @@ -8176,7 +8232,7 @@ void trigger_load_balance(struct rq *rq)
>  		raise_softirq(SCHED_SOFTIRQ);
>  #ifdef CONFIG_NO_HZ_COMMON
>  	if (nohz_kick_needed(rq))
> -		nohz_balancer_kick();
> +		nohz_balancer_kick(false);
>  #endif
>  }
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 676be22c..9cf53df 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1716,6 +1716,7 @@ extern void cfs_bandwidth_usage_dec(void);
>  enum rq_nohz_flag_bits {
>  	NOHZ_TICK_STOPPED,
>  	NOHZ_BALANCE_KICK,
> +	NOHZ_STATS_KICK,
>  };
>
>  #define nohz_flags(cpu)	(&cpu_rq(cpu)->nohz_flags)
>
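
To make the flag handling discussed in this review easier to follow, here is a small user-space sketch of the piggyback idea: a stats-only request is a balance kick with an extra bit set, and the kicked CPU decides what to do from both bits. All toy_* names are hypothetical and not kernel code; toy_time_before() only mirrors the wrap-safe comparison that the kernel's time_before() macro performs on jiffies.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers mirroring enum rq_nohz_flag_bits from the quoted patch. */
enum toy_nohz_bits {
	TOY_NOHZ_TICK_STOPPED,
	TOY_NOHZ_BALANCE_KICK,
	TOY_NOHZ_STATS_KICK,
};

enum toy_ilb_action {
	TOY_ILB_NOTHING,	/* no kick pending */
	TOY_ILB_STATS_ONLY,	/* refresh blocked load, skip balancing */
	TOY_ILB_BALANCE,	/* full nohz idle balance */
};

/* Decide what the kicked CPU should do from its pending flag bits. */
static enum toy_ilb_action toy_ilb_decide(unsigned long flags)
{
	if (!(flags & (1UL << TOY_NOHZ_BALANCE_KICK)))
		return TOY_ILB_NOTHING;
	if (flags & (1UL << TOY_NOHZ_STATS_KICK))
		return TOY_ILB_STATS_ONLY;
	return TOY_ILB_BALANCE;
}

/* Wrap-safe "a is earlier than b" for a free-running tick counter. */
static bool toy_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Usage example. */
static void toy_nohz_example(void)
{
	unsigned long kick = 1UL << TOY_NOHZ_BALANCE_KICK;
	unsigned long stats = 1UL << TOY_NOHZ_STATS_KICK;

	assert(toy_ilb_decide(kick) == TOY_ILB_BALANCE);
	assert(toy_ilb_decide(kick | stats) == TOY_ILB_STATS_ONLY);
	assert(toy_time_before(100, 105));
}
```

The TOY_ILB_STATS_ONLY branch corresponds to the suggestion above to update the blocked averages and skip rebalance_domains() when only a stats update was requested.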

9 years, 6 months

2
2
0 0

[Bug] Rq lock atomicity is broken by WALT patch?

by Leo Yan

Hi all,

While debugging the rb-tree related patches, I found it is easy to trigger a panic in my rb-tree code. The simple pseudo code below demonstrates it:

detach_tasks()
	node = rb_first(&env->src_rq->seq_node);	-> 'node_prev'
	while (node) {
		se = rb_entry(node, struct sched_entity, seq_node);
		node = rb_next(&se->seq_node);		-> 'node_next'

		if (balanced)
			break;

		if (meet_conditions_for_migration)
			detach_task(se);		-> Other CPU acquires src_rq lock
							-> and removes 'node_next' first
		else
			continue;
	}

In this flow detach_task() has been modified by the WALT patches: inside detach_task(), double_lock_balance(env->src_rq, env->dst_rq) releases the lock of the source rq and then acquires the source rq and destination rq locks in a specific sequence to avoid a recursive deadlock. But this gives other CPUs the chance to acquire the source rq lock and remove 'node_next' from the rb tree, e.g. the corresponding task can be dequeued on any other CPU (say CPU_B).

detach_tasks() then continues the iteration with 'node_next'. 'node_next' can meet the conditions for detaching, so the code tries to remove 'node_next' from the rb tree, but 'node_next' has already been removed by CPU_B. This finally triggers the panic. Please see the enclosed kernel log.

So essentially it is unsafe to release and re-acquire the rq lock while the scheduler is iterating the lists/tree of that rq. But this code was deliberately written for WALT, to update the workload statistics of the source rq and destination rq.

For now I can simply revert double_lock_balance()/double_unlock_balance() when only PELT signals are used, but for WALT I would like some suggestions for a fix. If we confirm this is a potential issue, it should exist on both Android common kernel 3.18 and 4.4.

/*
 * detach_task() -- detach the task for the migration specified in env
 */
static void detach_task(struct task_struct *p, struct lb_env *env)
{
	lockdep_assert_held(&env->src_rq->lock);

	deactivate_task(env->src_rq, p, 0);
	p->on_rq = TASK_ON_RQ_MIGRATING;
	double_lock_balance(env->src_rq, env->dst_rq);
	set_task_cpu(p, env->dst_cpu);
	double_unlock_balance(env->src_rq, env->dst_rq);
}

Thanks,
Leo Yan
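
The hazard described above can be modelled in user space. The sketch below is only an illustration under assumptions: a plain circular list stands in for the rb tree, the 'racer' argument models CPU_B dequeuing a task while the lock is dropped, and all toy_* names are hypothetical. It shows one conservative way to stay safe, namely never keeping a cached 'next' pointer across the point where the lock is dropped and re-reading the first node instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task queued on a runqueue. */
struct toy_task {
	struct toy_task *prev, *next;
	bool queued;
};

struct toy_rq {
	struct toy_task head;	/* sentinel of a circular list */
};

static void toy_rq_init(struct toy_rq *rq)
{
	rq->head.prev = rq->head.next = &rq->head;
	rq->head.queued = false;
}

static void toy_enqueue(struct toy_rq *rq, struct toy_task *t)
{
	t->prev = rq->head.prev;
	t->next = &rq->head;
	rq->head.prev->next = t;
	rq->head.prev = t;
	t->queued = true;
}

static void toy_dequeue(struct toy_task *t)
{
	assert(t->queued);	/* a double removal is the bug being modelled */
	t->prev->next = t->next;
	t->next->prev = t->prev;
	t->prev = t->next = NULL;
	t->queued = false;
}

/*
 * Detach every task. 'racer' models another CPU that dequeues a task
 * while the rq lock is dropped inside detach_task(); it may be NULL.
 * Returns the number of tasks this loop detached itself.
 */
static int toy_detach_all(struct toy_rq *rq, struct toy_task *racer)
{
	int detached = 0;

	while (rq->head.next != &rq->head) {
		/* Always re-read the first node; never keep a cached 'next'. */
		struct toy_task *t = rq->head.next;

		toy_dequeue(t);
		detached++;

		/* "Lock dropped": the other CPU removes its task now. */
		if (racer && racer->queued) {
			toy_dequeue(racer);
			racer = NULL;
		}
	}
	return detached;
}
```

With three queued tasks A, B and C and B as the racing task, the loop detaches A and C itself and never touches the already removed B. Restarting from the first node costs extra lookups, which is the trade-off against caching rb_next().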

9 years, 6 months

4
5
0 0

[PATCH] sched/fair: avoid to migrate negative boost task in load balance

by Leo Yan

Setting a negative boost value affects both task placement and OPP selection. For task placement, the scheduler uses boosted_task_util() to obtain a smaller utilization value for a task with a negative boost, which gives the task more chance to fit a low capacity CPU; as a result, this biases task placement towards low capacity CPUs (like the LITTLE cores of an ARM big.LITTLE system).

In the current code, the wakeup path uses this method to avoid migrating tasks with a negative boost value to a big core, but the load balance flow has no check for tasks with a negative boost value, so in the end it still migrates such tasks to a big core.

So this patch checks for tasks with a negative boost value in the load balance flow and avoids migrating them to a big CPU if the task fits a low capacity CPU.

Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
 kernel/sched/fair.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 77ca4df..c22d256 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6747,17 +6747,30 @@ static inline int migrate_degrades_locality(struct task_struct *p,
 static
 int can_migrate_task(struct task_struct *p, struct lb_env *env)
 {
-	int tsk_cache_hot;
+	int tsk_cache_hot, boost;
+	unsigned long cpu_rest_util;

 	lockdep_assert_held(&env->src_rq->lock);

 	/*
 	 * We do not migrate tasks that are:
-	 * 1) throttled_lb_pair, or
-	 * 2) cannot be migrated to this CPU due to cpus_allowed, or
-	 * 3) running (obviously), or
-	 * 4) are cache-hot on their current CPU.
+	 * 1) task has negative boost value and task fits cpu, or
+	 * 2) throttled_lb_pair, or
+	 * 3) cannot be migrated to this CPU due to cpus_allowed, or
+	 * 4) running (obviously), or
+	 * 5) are cache-hot on their current CPU.
 	 */
+	if (energy_aware() &&
+	    capacity_orig_of(env->dst_cpu) > capacity_orig_of(env->src_cpu)) {
+
+		boost = schedtune_task_boost(p);
+		cpu_rest_util = cpu_util(env->src_cpu) - task_util(p);
+		cpu_rest_util = max(0UL, cpu_rest_util);
+
+		if (boost < 0 && __task_fits(p, env->src_cpu, cpu_rest_util))
+			return 0;
+	}
+
 	if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
 		return 0;

--
1.9.1
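
One detail of the hunk above may deserve a second look. If cpu_util() and task_util() both return unsigned long, as the unsigned long type of cpu_rest_util suggests, the subtraction wraps around when the task utilization is larger than the CPU utilization, and max(0UL, ...) cannot undo that. The user-space sketch below illustrates the difference; both helper names are hypothetical and this is not a proposed kernel change.

```c
#include <assert.h>

/* Subtract without wrapping: returns 0 when 'sub' exceeds 'total'. */
static unsigned long toy_sub_floor0(unsigned long total, unsigned long sub)
{
	return total > sub ? total - sub : 0UL;
}

/* What "subtract, then max(0UL, x)" does on unsigned operands. */
static unsigned long toy_sub_then_max(unsigned long total, unsigned long sub)
{
	unsigned long rest = total - sub;	/* wraps when sub > total */

	return rest > 0UL ? rest : 0UL;		/* has no effect on the wrap */
}

/* Usage example. */
static void toy_sub_example(void)
{
	assert(toy_sub_floor0(150, 100) == 50);
	assert(toy_sub_floor0(100, 150) == 0);
	assert(toy_sub_then_max(100, 150) > 100);	/* huge wrapped value */
}
```

Comparing before subtracting avoids the wrap without needing a signed intermediate.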

9 years, 6 months

1
0
0 0

[PATCH 0/3] Evaluation for tracking task load/util with rb tree

by Leo Yan

o This patch series evaluates whether an rb tree can be used to track task load and util on the rq. One concern with this method is that an rb tree has O(log(N)) computational complexity, so maintaining the rb tree introduces extra workload. To address this concern I used hackbench for stress testing: hackbench generates a large number of message sender and receiver tasks, so there are many enqueue and dequeue operations, and we can use it to find out whether the rb tree introduces a big workload or not (thanks a lot to Chris for this suggestion).

  Another concern is that the scheduler already provides the LB_MIN feature; with LB_MIN enabled the scheduler avoids migrating tasks with load < 16. To some extent this also helps to filter out big tasks for migration. So we need to compare power data between this patch series and directly setting LB_MIN.

o Testing result:

  Tested hackbench on Hikey with CA53x8 CPUs with SMP load balance:

  time sh -c 'for i in `seq 100`; do /data/hackbench -p -P > /dev/null; done'

              real        user        system
  baseline    6m00.57s    1m41.72s    34m38.18s
  rb tree     5m55.79s    1m33.68s    34m08.38s

  For the hackbench test case we can see that the rb tree kernel even has a better result than the baseline kernel.

  Tested video playback on Juno for LB_MIN vs rb tree:

  LB_MIN      Nrg:LITTLE    Nrg:Big      Nrg:Sum
  ---------------------------------------------------------
              11.3122       8.983429     20.295629
              11.337446     8.174061     19.511507
              11.256941     8.547895     19.804836
              10.994329     9.633028     20.627357
              11.483148     8.522364     20.005512
  avg.        11.2768128    8.7721554    20.0489682

  rb tree     Nrg:LITTLE    Nrg:Big      Nrg:Sum
  ---------------------------------------------------------
              11.384301     8.412714     19.797015
              11.673992     8.455219     20.129211
              11.586081     8.414606     20.000687
              11.423509     8.64781      20.071319
              11.43709      8.595252     20.032342
  avg.        11.5009946    8.5051202    20.0061148
  vs LB_MIN   +1.99%        -3.04%       -0.21%

o Known issues:

  For patch 2, detach_tasks() iterates the rb tree for tasks; once a task has been detached it calls rb_first() to fetch the first node and iterates again from the first node. It would be better to use rb_next(), but changing to rb_next() introduces a panic. Any suggestion for a better implementation is welcome.

Leo Yan (3):
  sched/fair: support to track biggest task on rq
  sched/fair: select biggest task for migration
  sched: remove unused rq::cfs_tasks

 include/linux/sched.h        |   1 +
 include/linux/sched/sysctl.h |   1 +
 kernel/sched/core.c          |   2 -
 kernel/sched/fair.c          | 123 ++++++++++++++++++++++++++++++++++++-------
 kernel/sched/sched.h         |   5 +-
 kernel/sysctl.c              |   7 +++
 6 files changed, 116 insertions(+), 23 deletions(-)

--
1.9.1
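
For readers who want to re-derive the "avg." and "vs LB_MIN" rows from the raw energy numbers, the arithmetic can be sketched as below. The helper names are hypothetical; the only assumption is that the percentages are relative changes against the LB_MIN averages, which is consistent with the figures quoted in this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double toy_mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative change of 'val' against 'base', in percent. */
static double toy_pct_change(double base, double val)
{
	assert(base != 0.0);
	return (val - base) / base * 100.0;
}
```

For example, the LITTLE cluster averages give (11.5009946 - 11.2768128) / 11.2768128 * 100, which rounds to +1.99%.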

9 years, 6 months

4
16
0 0

[PATCH v1 0/7] EASv5.2+: Performance Optimization And Fixes

by Leo Yan

o This patch series includes performance optimizations and some fixes. One main purpose is to resolve performance issues for multi-threading; this is done by patches 0001, 0003, 0005 and 0006. It also includes one main fix for the tipping point, which is done by patch 0007.

o All these patches have been tested on the Juno R2 board. For the performance optimization patches in particular, the testing results are consistent and repeatable on the Juno board. This gives us more confidence to upstream these patches into the Android common kernel and the mainline kernel.

  The testing environment is based on the ARM LT git tree:
  https://git.linaro.org/landing-teams/working/arm/kernel-release.git
  branch: origin/lsk-4.4-armlt-experimental

  Test case: Geekbench with workload-automation

  Test setting:
  echo 0 > /proc/sys/kernel/sched_migration_cost_ns
  echo 1 > /proc/sys/kernel/sched_domain/cpu0/domain0/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu0/domain1/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu1/domain0/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu1/domain1/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu2/domain0/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu2/domain1/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu3/domain0/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu3/domain1/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu4/domain0/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu4/domain1/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu5/domain0/busy_factor
  echo 1 > /proc/sys/kernel/sched_domain/cpu5/domain1/busy_factor

o Test result:

  Optimization with Patch 0001:
                  baseline    Patch 0001         Opt.
  Geekbench ST:   953.2       966.2              1.36%
  Geekbench MT:   2175.8      2280.8             4.83%

  Optimization with Patch 0003:
                  baseline    Patch 0001+0003    Opt.
  Geekbench ST:   953.2       969.2              1.68%
  Geekbench MT:   2175.8      2356.8             8.32%

  Optimization with all patches:
                  baseline    All Patch          Opt.
  Geekbench ST:   953.2       968.6              1.62%
  Geekbench MT:   2175.8      2371.2             8.98%

  For the performance improvement, the three main contributing patches are: 0001: ~4.83%, 0003: ~3.3%, 0005: ~0.7%.

  One more thing to note: usually sched_migration_cost_ns also has a big impact on multi-threading performance, but we cannot see a prominent boost on the Juno board; the main reason is that the Juno board has only 2 big cores.

o Compared to the RFCv4 version [1], I have dropped all power optimization related patches. Those patches are important for power saving, but they contain a lot of hard-coded code that is not general enough. So I'd like to split these patches into a separate patch set.

[1] https://lists.linaro.org/pipermail/eas-dev/2016-September/000543.html

Leo Yan (7):
  sched/fair: kick nohz idle balance for misfit task
  sched/fair: replace capacity_of by capacity_orig_of
  sched/fair: fall back to traditional wakeup migration when system is busy
  sched/fair: fix build error for schedtune_task_margin
  sched/fair: force load balance when busiest group is overloaded
  Documentation: use sysfs for EAS performance tunning
  sched/fair: consider CPU overutilized only when it is not idle

 Documentation/scheduler/sched-energy.txt | 24 ++++++++++++++
 kernel/sched/fair.c                      | 57 +++++++++++++++++++++++++++-----
 2 files changed, 72 insertions(+), 9 deletions(-)

--
1.9.1

9 years, 6 months

6
36
0 0

[PATCH v1 0/4] EASv5.2+: Power Optimization

by Leo Yan

o This patch series optimizes power. Power optimization has to deal with two factors: on the one hand we need to find a method to save power and avoid unnecessary task migrations to the big cores; on the other hand performance must not degrade. So this patch series is based on the performance optimization patch series [1] and does further work on power saving to achieve the target: optimize power without performance degradation.

  RFCv3 introduced power optimization related patches, but those patches are not general enough. E.g. RFCv3 defines the criterion for a small task as task_util(p) < 1/4 * cpu_capacity(cpu), and it is very hard to apply this criterion across all SoCs. This patch series tries to figure out a more general method.

o Below is the background info for the power optimization:

  As a first step of power optimization, we should make sure the tasks in a cluster can spread out. This has two benefits: one is that it tries to decrease the frequency of every cluster; the other is that after spreading tasks within the cluster, the CPU capacity is exploited as much as possible and CPUs are kept from being overutilized, which as a result avoids migrating tasks to the big cores. This is done by patch 0001.

  If there are big tasks that really need to be migrated onto a big core, we should ensure that the big tasks are migrated to the big core first, rather than small tasks. So patch 0002 introduces an rb tree to track the biggest task on the RQ, and patch 0003 uses the rb tree to migrate the biggest tasks to a higher capacity CPU.

  Patch 0004 has the most effect on power saving: it checks whether the waking task can run on a low capacity CPU. If so, it forces the energy aware scheduling path even if the system is over the tipping point. The criterion for a waking task to run on a low capacity CPU is that some CPU's spare bandwidth can meet the requirement of the waking task; this ensures that even if the task keeps running on a low capacity CPU, performance is not sacrificed.

o Test result:

  First I applied the patch series "EASv5.2+: Performance Optimization And Fixes" and tested power and performance; then, on top of that code base, I also applied this power saving patch series. Finally I compared the power data and the performance data.

  For the power comparison the test case is video playback (1080p); below are the results on the Juno board:

  Items            | LITTLE Nrg | big Nrg   | Nrg
  ----------------------------------------------------------------
  Perf opt         | 11.0520992 | 9.7118762 | 20.7639754
  Perf + Power opt | 11.4157602 | 8.7319138 | 20.147674
  Comparison       | +3.29%     | -10.09%   | -2.97%

[1] https://lists.linaro.org/pipermail/eas-dev/2016-October/000610.html

Leo Yan (4):
  sched/fair: select lowest capacity CPU with packing tasks
  sched/fair: support to track biggest task util on rq
  sched/fair: migrate highest utilization task to higher capacity CPU
  sched/fair: check if wakeup task can run low capacity CPU

 include/linux/sched.h |   1 +
 kernel/sched/fair.c   | 213 +++++++++++++++++++++++++++++++++++++++++++++-----
 kernel/sched/sched.h  |   4 +
 3 files changed, 200 insertions(+), 18 deletions(-)

--
1.9.1
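
The two criteria mentioned in this cover letter can be contrasted with a small user-space sketch. It is only an illustration under assumptions: struct toy_cpu and both helpers are hypothetical, "spare bandwidth" is modelled as original capacity minus current utilization, and the capacity value 446 in the usage note is just an example for a LITTLE CPU. The actual patch 0004 may compute this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot. */
struct toy_cpu {
	unsigned long capacity_orig;	/* maximum capacity of the CPU */
	unsigned long util;		/* current utilization */
};

/* RFCv3-style test quoted above: task_util < 1/4 * cpu_capacity. */
static bool toy_is_small_task_rfcv3(unsigned long task_util,
				    unsigned long cpu_capacity)
{
	return task_util * 4 < cpu_capacity;
}

/* True if any CPU of the low capacity cluster has room for the task. */
static bool toy_fits_low_capacity(const struct toy_cpu *cpus, size_t n,
				  unsigned long task_util)
{
	assert(cpus != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		unsigned long spare = cpus[i].capacity_orig > cpus[i].util ?
				      cpus[i].capacity_orig - cpus[i].util : 0;

		if (spare >= task_util)
			return true;
	}
	return false;
}
```

With two LITTLE CPUs of capacity 446 at utilization 400 and 100, a waking task of utilization 300 passes the spare-bandwidth test (346 >= 300) although it fails the fixed 1/4 threshold, which shows why the second criterion adapts to the current load rather than to a per-SoC constant.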

9 years, 7 months

1
4
0 0

[PATCH RFCv2 0/7] Refine and enhancement for SchedTune

by Leo Yan

Hi Patrick,

This patch series mainly has two purposes.

The first purpose is to adjust the range of the capacity index so that the capacity index and the energy index have similar ranges. This helps tasks fall into a more reasonable PE filter region. This is done by patch 0001.

The second purpose is to support negative boost values in the PE filter, so that the schedTune algorithm is complete and supports both positive and negative boost values.

As we know, if we set the boost value to a positive value, the PE filter region rotates to the right, which gives more chance to the (PB) region and less chance to the (PC) region, so we finally get the filter region below:

                  ^
        (O)       |    /  (PB)
                  |   /
                  |  /
                  | / `-> cut
                  |/
     -------------------------->
                 /|
                / |
               /  |
              /   |
             /    |
        (PC)      |       (SO)

On the other hand, if the boost is set to a negative value, the PE filter region should rotate to the left, so we get the filter region below. This is done by patches 0002~0006.

                  ^
        (O)  \    |       (PB)
              \   |
               \  |
                \ |
                 \|
     -------------------------->
                  |\
                  | \
                  |  \
                  |   \
                  |    \
        (PC)      |     \  (SO)

Patch 0007 is used to verify the PE filter table with LISA.

I did some testing on Hikey with TraceAnalysis::plotEDiffSpace() for PE filtering and TraceAnalysis::plotTasks() for boosting signals; these tests passed.

v1 -> v2:
* Refine patch 0001 to discount cap_delta in function energy_diff();
* Fix bug and typo in patch 0003;
* Refine patch 0004 so that the optimal and sub-optimal region checks are also open when CONFIG_CGROUP_SCHEDTUNE is disabled;
* Add patch 0006 to support negative values for sysctl_sched_cfs_boost;
* Add patch 0007 to trace energy_diff properly.

Leo Yan (7):
  sched/fair: discount capacity index for PE filter
  sched/tune: minor fix for gain table
  sched/tune: polish for PE gain table index
  sched/tune: open optimal and sub-optimal regions for checking
  sched/tune: add PE filter support for negative boosting
  sched/tune: let sysctl_sched_cfs_boost support negative value
  DEBUG: sched/tune: move energy_diff trace point

 include/linux/sched/sysctl.h |   6 +--
 kernel/sched/fair.c          |  29 +++++++---
 kernel/sched/tune.c          | 124 +++++++++++++++++--------------------------
 kernel/sysctl.c              |   5 +-
 4 files changed, 76 insertions(+), 88 deletions(-)

--
1.9.1

9 years, 7 months

2
9
0 0

(no subject)

by Patrick Bellasi

Subject: Re: [Eas-dev] [RFC PATCH v1 0/3] sched: Introduce Window Assisted Load Tracking Reply-To: In-Reply-To: <7a94b493-178a-e2ed-a39d-66a7105f566a(a)arm.com> On 16-Sep 19:09, Dietmar Eggemann wrote: > On 03/09/16 00:27, markivx(a)codeaurora.org wrote: > > This patch series implements an alternative window assisted load tracking > > mechanism in lieu of PELT based cpu utilization tracking. Testing has > > shown that a window based non-decaying metric such as WALT guiding cpu > > frequency and task placement decisions can improve performance/power > > especially when running workloads more commonly found on mobile devices. > > The aim of this series is to incorporate WALT accounting into the > > scheduler and feed WALT statistics to schedutil in order to guide cpu > > frequency selection. The implementation is detailed in the commit text > > of Patch 1. The eventual goal is to also guide placement decisions > > based on WALT statistics. > > > > WALT has existed in out-of-tree kernels for ARM/ARM64 commercialized > > devices for a few years. This is an effort to bring WALT to mainline > > as well as to test on multiple architectures and with varied workloads. > > > > This RFC version is mainly to preview what the code will look like on > > mainline. Future RFC revisions will include a theoretical discussion and > > benchmark results. > > > > Tested on an Intel x86_64 machine (on top of 4.7-rc6). (Benchmark > > results will be sent out separately and as part of this message in the > > next RFC version). > > > > Patch 1: Adds WALT tracking to the scheduler > > > > Patches 2-3: Temporary patches to bring in EAS/sched-freq like capacity > > table and to use Intel PMC counters for more accurate > > frequency invariant load tracking on X86. Included for > > completeness but not meant for merging. 
> >
> >  include/linux/sched.h            |  35 ++++++++++
> >  include/linux/sched/sysctl.h     |   2 +
> >  include/trace/events/sched.h     |  76 +++++++++++++++++++++
> >  init/Kconfig                     |   9 +++
> >  kernel/sched/Makefile            |   1 +
> >  kernel/sched/core.c              |  29 ++++++++-
> >  kernel/sched/cpufreq_schedutil.c |  44 ++++++++++++-
> >  kernel/sched/cputime.c           |  11 +++-
> >  kernel/sched/debug.c             |  10 +++
> >  kernel/sched/fair.c              |   7 +-
> >  kernel/sched/sched.h             |  13 ++++
> >  kernel/sched/walt.c              | 580 ++++++++++++++++++++++++++++++++++
> >  kernel/sched/walt.h              |  75 +++++++++++++++++++++
> >  kernel/sysctl.c                  |  18 +++++
> >  14 files changed, 904 insertions(+), 6 deletions(-)
> >
>
> I caught a WALT related hard lockup on a v4.7 kernel with only patch 1 on
> top. Fairly easy to reproduce by watching a video in firefox browser on
> Ubuntu 16.04.
>
> $ addr2line -e vmlinux ffffffff810d835e
> kernel/sched/sched.h:1542
>
> $ addr2line -e vmlinux ffffffff810d29b0
> kernel/sched/sched.h:1538
>
> 1531 static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
> 1532         __releases(this_rq->lock)
> 1533         __acquires(busiest->lock)
> 1534         __acquires(this_rq->lock)
> 1535 {
> 1536         int ret = 0;
> 1537
> 1538         if (unlikely(!raw_spin_trylock(&busiest->lock))) {
> 1539                 if (busiest < this_rq) {
> 1540                         raw_spin_unlock(&this_rq->lock);
> 1541                         raw_spin_lock(&busiest->lock);
> 1542                         raw_spin_lock_nested(&this_rq->lock,
> 1543                                               SINGLE_DEPTH_NESTING);
> 1544                         ret = 1;
> 1545                 } else
> 1546                         raw_spin_lock_nested(&busiest->lock,
> 1547                                               SINGLE_DEPTH_NESTING);
> 1548         }
> 1549         return ret;
> 1550 }

This issue seems to me to be related to the one fixed by this patch from Todd:

   https://android.googlesource.com/kernel/common/+/ab1b90f03a063f4ef9899835e9…

We noticed the issue while working on AOSP v3.18, but it is potentially still present in mainline kernels since the implementation of the locking functions has not been updated.

Here is how Todd described a possible race condition:

Thanks for the review.
I've convinced myself that getting to move_queued_task() with the two cpus being the same is possible (but probably rare) since there are races between normal scheduler migration and the forced migration via the cpu_migration_thread. If the thread migrates naturally from the src to the dest and does it after the last check in __migrate_task, we get into this case. This can happen since we drop the rq lock during double_lock_balance allowing a migration behind our back while we are re-acquiring the rq lock.

And here is the summary of the analysis we did:

1. The double_(un)lock_balance() calls are mainly used by rt/deadline code, where there are proper checks that the two RQs are not the same. They are never used by core/fair, where the double_rq_(un)lock() calls are preferred.

2. All the usages of double_(un)lock_balance() in core/walt are introduced by WALT related patches. However, the invariant "RQs must not be the same" is not always guaranteed in these paths.

3. The implementation of double_(un)lock_balance() in both the Android kernel and mainline is "asymmetric". In the CONFIG_PREEMPT case at least, the locking call is implemented using double_rq_lock(), which provides the proper check on the RQs being different, while this check is not present in the unlocking function.

Juri has also got a confirmation from PeterZ that the double_(un)lock_balance() functions are not to be used when we cannot guarantee that the RQs are different. However, the asymmetry is still there, and thus this code deserves a patch in mainline as well, which is the one Todd added to AOSP v3.18.

Perhaps a better solution for WALT would be to use the double_rq_(un)lock() primitives instead of the double_(un)lock_balance() ones, which would also align the code better with the locking APIs already used in the core scheduler.
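To make the "RQs must not be the same" invariant concrete, here is a minimal user-space sketch of a symmetric lock/unlock pair, written in the spirit of the double_rq_(un)lock() primitives but not taken from the kernel. It assumes POSIX mutexes in place of rq spinlocks; struct queue, queue_double_lock() and queue_double_unlock() are hypothetical names.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a runqueue; only the lock matters here. */
struct queue {
	pthread_mutex_t lock;
};

/*
 * Lock two queues without deadlocking:
 *  - if both pointers name the same queue, take its lock only once;
 *  - otherwise always take the lower-addressed lock first, so that two
 *    threads locking the same pair cannot wait on each other.
 */
static void queue_double_lock(struct queue *a, struct queue *b)
{
	if (a == b) {
		pthread_mutex_lock(&a->lock);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}
}

/* Mirror of queue_double_lock(): the same-queue check is repeated here. */
static void queue_double_unlock(struct queue *a, struct queue *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}
```

The point of the sketch is the symmetry: both functions handle the a == b case, so a caller that cannot rule out identical queues neither locks the same lock twice nor unlocks it twice.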
Cheers Patrick > [ 118.795603] ============================================= > [ 118.795606] [ INFO: possible recursive locking detected ] > [ 118.795609] 4.7.0-walt-v4 #3 Not tainted > [ 118.795612] --------------------------------------------- > [ 118.795615] rtkit-daemon/3133 is trying to acquire lock: > [ 118.795619] (&rq->lock){-.-.-.}, at: [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300 > [ 118.795635] > [ 118.795635] but task is already holding lock: > [ 118.795638] (&rq->lock){-.-.-.}, at: [<ffffffff810d29b0>] push_rt_task.part.39+0xb0/0x2a0 > [ 118.795650] > [ 118.795650] other info that might help us debug this: > [ 118.795653] Possible unsafe locking scenario: > [ 118.795653] > [ 118.795656] CPU0 > [ 118.795659] ---- > [ 118.795661] lock(&rq->lock); > [ 118.795667] lock(&rq->lock); > [ 118.795673] > [ 118.795673] *** DEADLOCK *** > [ 118.795673] > [ 118.795676] May be due to missing lock nesting notation > [ 118.795676] > [ 118.795680] 1 lock held by rtkit-daemon/3133: > [ 118.795682] #0: (&rq->lock){-.-.-.}, at: [<ffffffff810d29b0>] push_rt_task.part.39+0xb0/0x2a0 > [ 118.795692] > [ 118.795692] stack backtrace: > [ 118.795697] CPU: 1 PID: 3133 Comm: rtkit-daemon Not tainted 4.7.0-walt-v4 #3 > [ 118.795700] Hardware name: LENOVO 2537Z5F/2537Z5F, BIOS 6IET74WW (1.34 ) 10/25/2010 > [ 118.795703] 0000000000000000 ffff8800ad7e77a8 ffffffff8143001c ffffffff829e8fc0 > [ 118.795711] ffffffff829e8fc0 ffff8800ad7e7848 ffffffff810e5eab ffff880000000000 > [ 118.795722] 000000000003e01f ffffffff8235f800 ffff8800af3ccd40 000000000000032f > [ 118.795729] Call Trace: > [ 118.795735] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110 > [ 118.795736] in_atomic(): 1, irqs_disabled(): 1, pid: 3133, name: rtkit-daemon > [ 118.795736] INFO: lockdep is turned off. 
> [ 118.795737] irq event stamp: 330 > [ 118.795741] hardirqs last enabled at (329): [<ffffffff818b675c>] _raw_spin_unlock_irq+0x2c/0x40 > [ 118.795743] hardirqs last disabled at (330): [<ffffffff818b6eeb>] _raw_spin_lock_irqsave+0x2b/0x90 > [ 118.795747] softirqs last enabled at (0): [<ffffffff810827b1>] copy_process.part.30+0x5c1/0x1e60 > [ 118.795749] softirqs last disabled at (0): [< (null)>] (null) > [ 118.795750] CPU: 1 PID: 3133 Comm: rtkit-daemon Not tainted 4.7.0-walt-v4 #3 > [ 118.795751] Hardware name: LENOVO 2537Z5F/2537Z5F, BIOS 6IET74WW (1.34 ) 10/25/2010 > [ 118.795753] 0000000000000001 ffff8800ad7e7390 ffffffff8143001c ffff8800af3ccd40 > [ 118.795754] ffffffff81ca0267 ffff8800ad7e73b8 ffffffff810b3490 ffffffff81ca0267 > [ 118.795756] 000000000000006e 0000000000000000 ffff8800ad7e73e0 ffffffff810b3599 > [ 118.795756] Call Trace: > [ 118.795763] [<ffffffff8143001c>] dump_stack+0x85/0xc9 > [ 118.795766] [<ffffffff810b3490>] ___might_sleep+0x180/0x240 > [ 118.795768] [<ffffffff810b3599>] __might_sleep+0x49/0x80 > [ 118.795771] [<ffffffff810fc838>] synchronize_irq+0x38/0xa0 > [ 118.795772] [<ffffffff810fbdfe>] ? __irq_put_desc_unlock+0x1e/0x40 > [ 118.795774] [<ffffffff810fcae9>] ? 
__disable_irq_nosync+0x49/0x70 > [ 118.795775] [<ffffffff810fcb3c>] disable_irq+0x1c/0x30 > [ 118.795787] [<ffffffffc0172a02>] e1000_netpoll+0xf2/0x120 [e1000e] > [ 118.795791] [<ffffffff817a3518>] netpoll_poll_dev+0x78/0x2c0 > [ 118.795793] [<ffffffff817a3900>] netpoll_send_skb_on_dev+0x1a0/0x290 > [ 118.795795] [<ffffffff817a3ccf>] netpoll_send_udp+0x2df/0x470 > [ 118.795798] [<ffffffffc012ab32>] write_msg+0xb2/0xf0 [netconsole] > [ 118.795800] [<ffffffff810f9489>] call_console_drivers.constprop.23+0x149/0x1e0 > [ 118.795802] [<ffffffff810fa334>] console_unlock+0x4e4/0x5b0 > [ 118.795803] [<ffffffff810fa7ae>] vprintk_emit+0x3ae/0x5d0 > [ 118.795805] [<ffffffff810fab29>] vprintk_default+0x29/0x40 > [ 118.795808] [<ffffffff811b7be2>] printk+0x4d/0x4f > [ 118.795812] [<ffffffff810372c2>] show_trace_log_lvl+0x32/0x60 > [ 118.795814] [<ffffffff8103681f>] show_stack_log_lvl+0xff/0x180 > [ 118.795816] [<ffffffff81037335>] show_stack+0x25/0x50 > [ 118.795818] [<ffffffff8143001c>] dump_stack+0x85/0xc9 > [ 118.795821] [<ffffffff810e5eab>] __lock_acquire+0x193b/0x1940 > [ 118.795823] [<ffffffff810deab4>] ? cpuacct_charge+0xd4/0x1d0 > [ 118.795825] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.795826] [<ffffffff810d1dda>] ? update_curr_rt+0x15a/0x300 > [ 118.795828] [<ffffffff810e6533>] lock_acquire+0xd3/0x220 > [ 118.795830] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 > [ 118.795831] [<ffffffff818b638d>] _raw_spin_lock+0x3d/0x80 > [ 118.795833] [<ffffffff810d835e>] ? 
walt_fixup_busy_time+0x1ee/0x300 > [ 118.795834] [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300 > [ 118.795836] [<ffffffff810b783c>] set_task_cpu+0xac/0x2e0 > [ 118.795837] [<ffffffff810d2a53>] push_rt_task.part.39+0x153/0x2a0 > [ 118.795839] [<ffffffff810d2cb7>] push_rt_tasks+0x17/0x30 > [ 118.795841] [<ffffffff811b6d3b>] __balance_callback+0x45/0x5c > [ 118.795844] [<ffffffff818b0d96>] __schedule+0xaf6/0xbb0 > [ 118.795846] [<ffffffff818b0e8c>] schedule+0x3c/0x90 > [ 118.795847] [<ffffffff818b6053>] schedule_hrtimeout_range_clock+0xe3/0x140 > [ 118.795850] [<ffffffff811125c0>] ? hrtimer_init+0x230/0x230 > [ 118.795852] [<ffffffff818b6047>] ? schedule_hrtimeout_range_clock+0xd7/0x140 > [ 118.795853] [<ffffffff818b60c3>] schedule_hrtimeout_range+0x13/0x20 > [ 118.795858] [<ffffffff81261604>] poll_schedule_timeout+0x54/0x80 > [ 118.795859] [<ffffffff81262e67>] do_sys_poll+0x3a7/0x510 > [ 118.795861] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.795864] [<ffffffff8129ac30>] ? ep_poll_callback+0x120/0x360 > [ 118.795866] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.795867] [<ffffffff810d60c0>] ? __wake_up_sync_key+0x50/0x60 > [ 118.795869] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 > [ 118.795871] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 > [ 118.795873] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.795875] [<ffffffff811eeced>] ? __might_fault+0x4d/0xa0 > [ 118.795877] [<ffffffff810e497d>] ? __lock_acquire+0x40d/0x1940 > [ 118.795879] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.795880] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 > [ 118.795883] [<ffffffff8111a3a4>] ? ktime_get_ts64+0x84/0x180 > [ 118.795885] [<ffffffff810e424d>] ? trace_hardirqs_on+0xd/0x10 > [ 118.795886] [<ffffffff8111a3d6>] ? ktime_get_ts64+0xb6/0x180 > [ 118.795888] [<ffffffff81261a2a>] ? 
poll_select_set_timeout+0x5a/0x90 > [ 118.795889] [<ffffffff81263095>] SyS_poll+0x65/0xf0 > [ 118.795891] [<ffffffff818b7080>] entry_SYSCALL_64_fastpath+0x23/0xc1 > [ 118.796149] [<ffffffff8143001c>] dump_stack+0x85/0xc9 > [ 118.796153] [<ffffffff810e5eab>] __lock_acquire+0x193b/0x1940 > [ 118.796156] [<ffffffff810deab4>] ? cpuacct_charge+0xd4/0x1d0 > [ 118.796159] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.796163] [<ffffffff810d1dda>] ? update_curr_rt+0x15a/0x300 > [ 118.796166] [<ffffffff810e6533>] lock_acquire+0xd3/0x220 > [ 118.796169] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 > [ 118.796173] [<ffffffff818b638d>] _raw_spin_lock+0x3d/0x80 > [ 118.796176] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300 > [ 118.796179] [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300 > [ 118.796183] [<ffffffff810b783c>] set_task_cpu+0xac/0x2e0 > [ 118.796187] [<ffffffff810d2a53>] push_rt_task.part.39+0x153/0x2a0 > [ 118.796190] [<ffffffff810d2cb7>] push_rt_tasks+0x17/0x30 > [ 118.796194] [<ffffffff811b6d3b>] __balance_callback+0x45/0x5c > [ 118.796198] [<ffffffff818b0d96>] __schedule+0xaf6/0xbb0 > [ 118.796201] [<ffffffff818b0e8c>] schedule+0x3c/0x90 > [ 118.796204] [<ffffffff818b6053>] schedule_hrtimeout_range_clock+0xe3/0x140 > [ 118.796207] [<ffffffff811125c0>] ? hrtimer_init+0x230/0x230 > [ 118.796211] [<ffffffff818b6047>] ? schedule_hrtimeout_range_clock+0xd7/0x140 > [ 118.796215] [<ffffffff818b60c3>] schedule_hrtimeout_range+0x13/0x20 > [ 118.796218] [<ffffffff81261604>] poll_schedule_timeout+0x54/0x80 > [ 118.796221] [<ffffffff81262e67>] do_sys_poll+0x3a7/0x510 > [ 118.796225] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.796228] [<ffffffff8129ac30>] ? ep_poll_callback+0x120/0x360 > [ 118.796232] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.796235] [<ffffffff810d60c0>] ? __wake_up_sync_key+0x50/0x60 > [ 118.796239] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150 > [ 118.796242] [<ffffffff812617d0>] ? 
poll_select_copy_remaining+0x150/0x150 > [ 118.796246] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.796249] [<ffffffff811eeced>] ? __might_fault+0x4d/0xa0 > [ 118.796253] [<ffffffff810e497d>] ? __lock_acquire+0x40d/0x1940 > [ 118.796257] [<ffffffff8103d319>] ? sched_clock+0x9/0x10 > [ 118.796260] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 > [ 118.796264] [<ffffffff8111a3a4>] ? ktime_get_ts64+0x84/0x180 > [ 118.796268] [<ffffffff810e424d>] ? trace_hardirqs_on+0xd/0x10 > [ 118.796271] [<ffffffff8111a3d6>] ? ktime_get_ts64+0xb6/0x180 > [ 118.796275] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90 > [ 118.796278] [<ffffffff81263095>] SyS_poll+0x65/0xf0 > [ 118.796281] [<ffffffff818b7080>] entry_SYSCALL_64_fastpath+0x23/0xc1 > [ 128.972478] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0dModules linked in: intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel arc4 aesni_intel iwldvm aes_x86_64 lrw gf128mul mac80211 glue_helper ablk_helper cryptd joydev iwlwifi snd_hda_codec_hdmi serio_raw snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel intel_ips snd_hda_codec cfg80211 snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi thinkpad_acpi snd_seq lpc_ich snd_seq_device mei_me snd_timer nvram mei snd soundcore mac_hid shpchp netconsole configfs parport_pc ppdev lp parport autofs4 hid_generic nouveau mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea firewire_ohci sysfillrect usbhid sysimgblt ahci fb_sys_fops e1000e psmouse hid libahci sdhci_pci firewire_core crc_itu_t sdhci drm ptp pps_core video wmi > [ 128.972479] irq event stamp: 1026850 > [ 128.972480] hardirqs last enabled at (1026849): [<ffffffff811243a6>] tick_nohz_idle_enter+0x46/0x80 > [ 128.972481] hardirqs last disabled at (1026850): [<ffffffff810d6a7d>] cpu_startup_entry+0xcd/0x450 > [ 128.972481] softirqs last enabled at (1026834): [<ffffffff8108b181>] _local_bh_enable+0x21/0x50 > [ 128.972482] 
softirqs last disabled at (1026833): [<ffffffff8108c2b2>] irq_enter+0x72/0xa0 > _______________________________________________ > eas-dev mailing list > eas-dev(a)lists.linaro.org > https://lists.linaro.org/mailman/listinfo/eas-dev -- #include <best/regards.h> Patrick Bellasi

9 years, 7 months

2
1
0 0

[RFC PATCH v1 0/3] sched: Introduce Window Assisted Load Tracking

by markivx＠codeaurora.org

This patch series implements an alternative window assisted load tracking mechanism in lieu of PELT based cpu utilization tracking. Testing has shown that a window based non-decaying metric such as WALT guiding cpu frequency and task placement decisions can improve performance/power especially when running workloads more commonly found on mobile devices. The aim of this series is to incorporate WALT accounting into the scheduler and feed WALT statistics to schedutil in order to guide cpu frequency selection. The implementation is detailed in the commit text of Patch 1. The eventual goal is to also guide placement decisions based on WALT statistics.

WALT has existed in out-of-tree kernels for ARM/ARM64 commercialized devices for a few years. This is an effort to bring WALT to mainline as well as to test on multiple architectures and with varied workloads.

This RFC version is mainly to preview what the code will look like on mainline. Future RFC revisions will include a theoretical discussion and benchmark results.

Tested on an Intel x86_64 machine (on top of 4.7-rc6). (Benchmark results will be sent out separately and as part of this message in the next RFC version).

Patch 1: Adds WALT tracking to the scheduler

Patches 2-3: Temporary patches to bring in EAS/sched-freq like capacity
             table and to use Intel PMC counters for more accurate
             frequency invariant load tracking on X86. Included for
             completeness but not meant for merging.

 include/linux/sched.h            |  35 ++++++++++
 include/linux/sched/sysctl.h     |   2 +
 include/trace/events/sched.h     |  76 +++++++++++++++++++++
 init/Kconfig                     |   9 +++
 kernel/sched/Makefile            |   1 +
 kernel/sched/core.c              |  29 ++++++++-
 kernel/sched/cpufreq_schedutil.c |  44 ++++++++++++-
 kernel/sched/cputime.c           |  11 +++-
 kernel/sched/debug.c             |  10 +++
 kernel/sched/fair.c              |   7 +-
 kernel/sched/sched.h             |  13 ++++
 kernel/sched/walt.c              | 580 ++++++++++++++++++++++++++++++++++
 kernel/sched/walt.h              |  75 +++++++++++++++++++++
 kernel/sysctl.c                  |  18 +++++
 14 files changed, 904 insertions(+), 6 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
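As a rough illustration of what a "window based non-decaying metric" means, the sketch below accumulates busy time in fixed windows and reports the busy fraction of the last completed window on a 0..1024 scale. It is a deliberately simplified model and not the code in kernel/sched/walt.c: struct window_stats and all function names are hypothetical, and timestamps are assumed to be monotonically non-decreasing nanosecond values.

```c
#include <stdint.h>

#define CAPACITY_SCALE 1024u	/* assumed 0..1024 utilization scale */

/* Hypothetical per-CPU window state; field names are illustrative. */
struct window_stats {
	uint64_t window_ns;	/* fixed window length */
	uint64_t window_start;	/* start time of the current window */
	uint64_t curr_busy_ns;	/* busy time in the current window */
	uint64_t prev_busy_ns;	/* busy time of the last completed window */
};

/* Roll the window forward so that 'now' falls inside the current window. */
static void window_rollover(struct window_stats *ws, uint64_t now)
{
	uint64_t elapsed;

	if (now < ws->window_start + ws->window_ns)
		return;

	elapsed = (now - ws->window_start) / ws->window_ns;
	/* If more than one window passed, the last completed one was idle. */
	ws->prev_busy_ns = (elapsed == 1) ? ws->curr_busy_ns : 0;
	ws->curr_busy_ns = 0;
	ws->window_start += elapsed * ws->window_ns;
}

/* Account the busy interval [start, end), splitting it at window edges. */
static void window_account_busy(struct window_stats *ws,
				uint64_t start, uint64_t end)
{
	while (start < end) {
		uint64_t window_end, chunk_end;

		window_rollover(ws, start);
		window_end = ws->window_start + ws->window_ns;
		chunk_end = end < window_end ? end : window_end;
		ws->curr_busy_ns += chunk_end - start;
		start = chunk_end;
	}
}

/* Utilization of the last completed window, scaled to 0..1024. */
static unsigned int window_util(const struct window_stats *ws)
{
	return (unsigned int)(ws->prev_busy_ns * CAPACITY_SCALE / ws->window_ns);
}
```

In this model the reported value changes only at window boundaries and reflects a whole window at once, whereas a PELT-style signal decays geometrically over time.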

9 years, 7 months

6
20
0 0

schedfreq vs. schedutil (and interactive) with EAS on Hikey Android 4.4

by Juri Lelli

Hi,

In preparation for a Connect LAS16 hacking session, I've been trying to compare schedfreq with schedutil with EAS on Hikey running Android 4.4.

In the document below you can view some preliminary results of what I found. Please excuse the brevity; I'm sure way more comments would be needed to make the document more understandable, but I hope it's still useful and I wanted to share it as soon as possible. Please don't hesitate to ask for more information or clarifications (best is by commenting directly on the document, I think).

https://docs.google.com/document/d/1tMb9yfJgaZmVANbbhTWwjLQLA2v5tj-_aTLXDeS…

tl;dr:

- merging pain wasn't so bad
- schedutil is relatively close to schedfreq and interactive (even if high percentiles seem to be quite off): perf is lower, but it saves some energy
- schedutil driven by WALT generally improves figures (but not that much)
- it remains to be seen how much work is required to put schedtune on top of schedutil

Best,

- Juri

9 years, 7 months

3
6
0 0

[PATCH 0/4] EASv5.2+: More Accurate Estimation For Waken-up Path

by Leo Yan

This patch series is essentially based on Morten's patch "sched/fair: Compute task/cpu utilization at wake-up more correctly"; it aims at a more accurate estimation of CPU utilization so that the proper CPU is chosen whenever possible.

Before, we had two main issues with CPU utilization:

- Without Morten's patch, the CPU the task previously ran on carries stale utilization for the task; so after the task is woken up, if we add the previous CPU's utilization and the task's utilization, part of the task utilization is actually counted twice. As a result, the previous CPU has less chance of being chosen for the task. So patch "sched/fair: use cpu_util_wake() for energy awared path" builds on Morten's patch to correct the previous CPU's utilization value if the task has run on it.

- Another well known issue is that an idle CPU's utilization keeps an old value after the CPU enters an idle state, so the idle CPU's utilization does not change until it is woken up again. This is misleading when selecting the target CPU. In the kernel, update_blocked_averages() can be called directly to update the idle CPUs' utilization values. But this function acquires the CPU's rq lock, so CPUs will contend for that lock. This is the main concern, since it may cause a performance issue, so the update is only done when the CPU is idle and its utilization value has not yet decayed to 0.

Leo Yan (3):
  sched/fair: use cpu_util_wake() for energy awared path
  sched/fair: add trace point for sched_new_util
  sched/fair: update idle CPUs utilization when wake task

Morten Rasmussen (1):
  sched/fair: Compute task/cpu utilization at wake-up more correctly

 include/trace/events/sched.h | 25 ++++++++++++++
 kernel/sched/fair.c          | 80 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 104 insertions(+), 1 deletion(-)

--
1.9.1
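The double counting described in the first issue above can be shown with a small sketch. It only illustrates the idea of removing the waking task's stale contribution from its previous CPU before adding the task back; it is not the cpu_util_wake() code from the patches, and struct wake_sample and both helper names are hypothetical.

```c
/* Hypothetical inputs; in the kernel these come from the rq and the task. */
struct wake_sample {
	unsigned long cpu_util;		/* utilization accounted on the CPU */
	unsigned long task_util;	/* utilization of the waking task */
	int task_prev_cpu;		/* CPU the task last ran on */
};

/* Utilization of 'cpu' with the waking task's stale contribution removed. */
static unsigned long cpu_util_without_task(int cpu, const struct wake_sample *s)
{
	/* Only the previous CPU still carries the task's utilization. */
	if (cpu != s->task_prev_cpu)
		return s->cpu_util;

	/* Clamp at zero in case the task signal exceeds the CPU signal. */
	return s->cpu_util > s->task_util ? s->cpu_util - s->task_util : 0;
}

/* Utilization 'cpu' would have if the waking task were placed on it. */
static unsigned long cpu_util_with_task(int cpu, const struct wake_sample *s)
{
	return cpu_util_without_task(cpu, s) + s->task_util;
}
```

With cpu_util = 500 and task_util = 200, the naive sum for the previous CPU would be 700, while the corrected estimate stays at 500, so the previous CPU is no longer penalized for utilization it already includes.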

9 years, 8 months

2
6
0 0

[PATCH RFC 0/5] Refine and Enhancement for SchedTune

by Leo Yan

Hi Patrick,

This patch series refines and enhances schedTune. It has two main purposes.

One purpose is to adjust the range of the capacity index so that the capacity index and the energy index have similar ranges. This helps tasks fall into a more reasonable PE filter region. This is done by patch 1.

The other purpose is to support negative boost values in the PE filter, so that the schedTune algorithm is complete and supports both positive and negative boost values. This is done by patches 2~5.

Please note that this patch set is mainly intended for discussion. I have _NOT_ done any testing on my side.

Leo Yan (5):
  sched/fair: discount capacity index for PE filter
  sched/tune: minor fix for gain table
  sched/tune: polish for PE gain table index
  sched/tune: open optimal and sub-optimal regions for checking
  sched/tune: add PE filter support for negative boosting

 kernel/sched/fair.c |  10 +++++
 kernel/sched/tune.c | 111 +++++++++++++++++++++++-----------------------------
 2 files changed, 58 insertions(+), 63 deletions(-)

--
1.9.1

9 years, 8 months

2
7
0 0
