Fair enough, I just didn't particularly want yet another patch mucking around with the config.
A revert of the one that did the damage and then another patch to remove the other stuff (if you can't do 2 reverts) would be much nicer.
Mark
-----Original Message----- From: Chris Redpath [mailto:Chris.Redpath@arm.com] Sent: 01 May 2014 16:01 To: Mark Hambleton; Mark Brown; Jon Medhurst (Tixy) Cc: Linaro Kernel Subject: Re: [PATCH] sched: hmp: Remove compile-time configuration of task packing
Hi Mark,
On 01/05/14 15:36, Mark Hambleton wrote:
- PGP Signed by an unknown key
On Thu, May 01, 2014 at 03:24:18PM +0100, Jon Medhurst (Tixy) wrote:
I personally don't care one way or the other, I'm just a piggy-in-the-middle patch monkey chimera who'd wish people would
make
up
their minds what they want :-)
So if Mark H, Mark B, and Robin can come to a consensus on what the
task
packing config should be, then tell me...
I have no real opinion on what the best configuration is, I'm assuming this is intended as a per system tunable and that the main point of the change was to demonstrate best practice for doing the tuning better.
My problem here is that this patch doesn't just revert the change that turns
off task packing, it also reverts the config that most will probably have been building with...
hmp_full_threshold now changes from:
unsigned int hmp_full_threshold = 650; /* 80% of the 800Mhz freq *
NICE_0_LOAD */
To:
unsigned int hmp_full_threshold = (NICE_0_LOAD * 9) / 8;
As I suspect most will be building with TC2 enabled.
Yes it is a tunable, and yes it can be set for a platform.. but it is just more
messing about when this patch finally gets reverted.
It's actually quite tricky to determine what the optimal value is as it depends where the voltage steps in the DVFS curve go. There isn't really a suitable default for all systems - the nice thing about the TC2 specific data was that it showed that you need to think about it and that you might not even want this feature.
On TC2 there are two problems with packing.
Firstly, the A7s use a lot more power when the first voltage change comes in the DVFS curve, as the comment says. This is why we did the threshold change - to keep the A7s below the first voltage step as much as possible.
If you assume that most of the DVFS governors try to keep utilization in the 80% range by changing frequency (mostly, by default) then you can try to keep the load on a packed CPU down this way but it isn't going to guarantee that you don't go over the target frequency. We only decide on packing at wakeup time, once you've decided then you are stuck with the decision until the next load balance or wakeup.
It turns out that it's difficult to test that it's doing the right thing every time as the packing conditions depend on instantaneous scheduler conditions and these are hard to control. Even when you're sure you are doing the right thing, if you've got synchronized test threads all active at the same time then the scheduler's idea of load is *runnable* time so you get an increase in apparent load from packing those threads together.
Secondly, I've never seen the TC2 A7 cluster power down fully in a trace. When you have one A7 on, hence the L2 etc. is on, the overhead of adding more A7s is low assuming the frequency is.
Anyway, the point is we need a default, 650 is as good as anything else for the reasons above. If it doesn't suit a platform, change it in the init script - you were supposed to do that anyway :)
We have already reverted the latest patch, but what would be cleaner is to
revert the first patch and then add a second patch to remove the other bit of config and just set the value to the one that is most likely the most common default?
I haven't had any feedback what people are setting this to in order to decide what the most common value would be. All data points gratefully received.
Mark
--Chris