On 08/18/20 15:14, Yun Hsiang wrote:
(1) Reduce the tasks' UCLAMP_MIN space to [0 .. 819] (mentioned above) or remap [0 .. 1023] to [0 .. 819] internally.
Reducing uclamp_min in userspace is a hack, though it is okay as a temporary workaround. This mapping issue is an internal implementation detail that must be fixed internally IMO.
IMO, uclamp simply clamps the task util. So this remap hack seems a little weird to me. And as Dietmar mentioned, if there is other capacity pressure (RT/DL/IRQ/thermal), we will still hit the same problem.
It's a stepping stone to fix the immediate problem. I agree it's not an optimal solution.
(2) System-wide spare capacity CPU (your proposal).
The issue w/ this is that you would return the system-wide max_spare_capacity CPU even though a CPU sharing the Perf Domain with prev_cpu could serve as the best_energy_cpu. This can happen when only prev_cpu experiences (temporary) capacity pressure.
(3) Route a task which cannot fit on prev_cpu due to its UCLAMP_MIN value through select_idle_sibling() -> select_idle_capacity(). I.e. add an appropriate check next to the rd->overutilized and sd checks at the beginning of find_energy_efficient_cpu() which goes to the fail: label.
Similar issue as in (2) here.
boosting != overutilized IMO.
If the task asked for more boosting than can be satisfied, then best effort is a better fallback. 1024 for me translates into the 'maximum available performance point'. If the system is under pressure, we still want to try to return the maximum performance point under pressure, even if it is not the absolute maximum one can get without pressure.
I agree. I think the task util/uclamp is a performance requirement. The scheduler should find a CPU that satisfies the task's performance requirement (util), and if possible, choose the energy-efficient one. So if there is no CPU that can satisfy the requirement, finding the CPU with max spare capacity may be a better choice.
This does sound ideal to me too. But how do you handle the other 20% range, e.g. util_min = 850? In this case another medium CPU could be a better fit and more energy efficient.
If you can improve your patch to handle this 20% scenario and send it as RFC to LKML, it'll help push things forward in this direction :-)
If the actual utilization of the task is high, then the overutilization logic will trigger anyway.
(4) Deal with the fact that we return prev_cpu in this case (current behavior). Doesn't seem too bad in combination with (1).
Fixing (1) internally fixes the immediate bug, agreed. I think (2) is a good option, but it is an optimization rather than a bug fix and needs more info to see how much it's worth.
Yun, can you share more info about your use case?
Our case is a gaming scenario on Android. We want a task (related to the game but with low util) to run on a big core at the highest frequency. We set uclamp_min to 1024 for the task but it stays on a little core.
Thanks for sharing!
Cheers
-- Qais Yousef