Dave,
On Thu, Aug 23, 2012 at 8:47 PM, Dave Martin <dave.martin@linaro.org> wrote:
On Thu, Aug 23, 2012 at 04:51:45PM +0800, Lei Wen wrote:
Hi Dave,
[snip here]
We don't use the virtualisation extensions for this in our current code. It just becomes normal kernel code, analogous to subsystems like cpufreq and CPU hotplug.
Virtualisation is only really needed if we want to trick the OS into thinking that it is not really being migrated between different physical CPUs. This approach has the advantage that it can work with any OS, with no need for modifying the OS. But because we can modify Linux so that it understands and controls the switching, virtualisation is not needed.
This also makes it easier to use the virtualisation extensions for running true hypervisors like KVM, because we don't have to work out a way to let KVM and the switcher co-exist in hypervisor space.
An in-kernel implementation is a very elegant way to handle the coexistence of switching and KVM. :)
Since the in-kernel implementation uses paired-CPU switching, I think it is closer to the big.LITTLE MP solution, which also has both A7 and A15 alive. The only difference here is that paired-CPU switching allows only one CPU in each pair to be alive at a time. I don't know whether I understand it right: the system may end up running with both A7 and A15 active. Correct me if I am wrong. :)
Ignoring some implementation details, your understanding is correct:
With big.LITTLE MP and the switcher, the kernel has access to all the physical CPUs in the system.
In a sense, the switcher implements a particular policy for how the CPUs are used. big.LITTLE MP just gives all the CPUs to the kernel, but the switcher combines the physical CPUs into big+LITTLE pairs so that only one is running at any given time, and presents those logical paired CPUs to the rest of the kernel.
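To make the pairing idea concrete, here is a rough standalone sketch (not the actual switcher code; the topology, structures and names are invented for illustration):

/*
 * Hypothetical sketch: each logical CPU exposed to the kernel is backed
 * by one big and one LITTLE physical CPU, and only one of the pair is
 * active at any given time. Physical CPUs 0-3 are assumed big, 4-7 LITTLE.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_LOGICAL_CPUS 4

struct cpu_pair {
	int big_cpu;		/* physical CPU id in the A15 cluster */
	int little_cpu;		/* physical CPU id in the A7 cluster  */
	bool on_big;		/* which side of the pair is running  */
};

static struct cpu_pair pairs[NR_LOGICAL_CPUS] = {
	{ 0, 4, false }, { 1, 5, false }, { 2, 6, false }, { 3, 7, false },
};

/* The rest of the kernel only ever deals with the logical CPU number. */
static int logical_to_physical(int logical_cpu)
{
	struct cpu_pair *p = &pairs[logical_cpu];

	return p->on_big ? p->big_cpu : p->little_cpu;
}

int main(void)
{
	pairs[2].on_big = true;	/* "switch" logical CPU 2 to its big core */

	for (int cpu = 0; cpu < NR_LOGICAL_CPUS; cpu++)
		printf("logical CPU %d -> physical CPU %d\n",
		       cpu, logical_to_physical(cpu));
	return 0;
}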
One question here: do we still need to bind CPUs with identical CPU IDs into a pair? I think only that way would the processor still appear unchanged to the OS. I also don't know whether a changed cluster ID would affect system processes or not, since it is not virtualised by a hypervisor as it is in ARM's reference code.
So does this in-kernel implementation take into consideration the load-balancing issue that the MP solution also faces, given the difference in computing capability? Or did Linaro just do paired-CPU switching for all A7/A15 pairs, which would mimic the cluster switching in ARM's reference code?
With big.LITTLE MP, you have many CPUs with differing properties:
bbbbLLLL
whereas with the switcher, the kernel sees fewer CPUs, but they are identical:
ssss
The switcher logical CPUs can run either on the big or little cluster, but for scheduling purposes most of the kernel does not really need to understand this. big/LITTLE becomes an extra performance point parameter for each logical CPU, similar to frequency/voltage scaling. Because the switcher logical CPUs have identical properties, the kernel can treat them as identical for scheduling purposes. This means that the scheduler should work sensibly without any modifications.
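As a rough illustration of treating big/LITTLE as just one more performance point per logical CPU, here is a hypothetical cpufreq-style table (all names, frequencies and the selection logic are made up, not the Linaro implementation):

/*
 * Hypothetical sketch: the cluster choice is folded into an ordered
 * table of operating points, so picking a point works exactly like a
 * frequency/voltage scaling decision and the scheduler never sees
 * big vs LITTLE at all.
 */
#include <stdio.h>
#include <stdbool.h>

struct perf_point {
	bool on_big;		/* run this logical CPU on the big core? */
	unsigned int freq_khz;	/* operating frequency at this point     */
};

/* Ordered from slowest/most efficient to fastest. */
static const struct perf_point perf_table[] = {
	{ false,  600000 },	/* LITTLE @ 600 MHz */
	{ false, 1000000 },	/* LITTLE @ 1.0 GHz */
	{ true,  1200000 },	/* big    @ 1.2 GHz */
	{ true,  1600000 },	/* big    @ 1.6 GHz */
};

#define NR_POINTS (sizeof(perf_table) / sizeof(perf_table[0]))

/* Pick the lowest point that satisfies the requested performance;
 * crossing the big/LITTLE boundary is just a side effect. */
static const struct perf_point *pick_point(unsigned int target_khz)
{
	for (unsigned int i = 0; i < NR_POINTS; i++)
		if (perf_table[i].freq_khz >= target_khz)
			return &perf_table[i];
	return &perf_table[NR_POINTS - 1];
}

int main(void)
{
	const struct perf_point *p = pick_point(1100000);

	printf("target 1.1 GHz -> %s @ %u kHz\n",
	       p->on_big ? "big" : "LITTLE", p->freq_khz);
	return 0;
}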
Good abstraction! However, I cannot see why the kernel would still believe those logical CPUs have the same computing capability if the real CPUs running are bLbL. What I understand from SMP is that the kernel believes the CPUs have the same DMIPS. Does the logical CPU fake its DMIPS capability and report the same value to the kernel?
Just as with frequency/voltage scaling, we can decide when to switch each CPU to big or little depending on how busy that CPU is. Note that the Linaro switcher implementation switches each CPU independently. It does not switch them all at the same time like the ARM reference switcher implementation does.
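A minimal sketch of what such an independent, per-CPU decision could look like (the thresholds and structure are invented for illustration, not the Linaro policy):

/*
 * Hypothetical sketch: each logical CPU is moved to big or LITTLE
 * independently, based only on its own load, with simple hysteresis.
 */
#include <stdio.h>
#include <stdbool.h>

#define UP_THRESHOLD	80	/* % load above which we go to big    */
#define DOWN_THRESHOLD	30	/* % load below which we go to LITTLE */

struct logical_cpu {
	int id;
	bool on_big;
};

/* Called periodically for each logical CPU, independently of the others. */
static void update_cluster(struct logical_cpu *cpu, unsigned int load_pct)
{
	if (!cpu->on_big && load_pct > UP_THRESHOLD) {
		cpu->on_big = true;
		printf("CPU%d: load %u%%, switching to big\n", cpu->id, load_pct);
	} else if (cpu->on_big && load_pct < DOWN_THRESHOLD) {
		cpu->on_big = false;
		printf("CPU%d: load %u%%, switching to LITTLE\n", cpu->id, load_pct);
	}
}

int main(void)
{
	struct logical_cpu cpus[2] = { { 0, false }, { 1, true } };

	update_cluster(&cpus[0], 95);	/* busy CPU moves up to big      */
	update_cluster(&cpus[1], 10);	/* idle CPU drops back to LITTLE */
	return 0;
}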
Does the cpufreq driver need to consider whether switching clusters actually brings a power benefit? For example, the power of bLLL may be higher than that of LLLL, since bLLL has both clusters powered on while LLLL has only one cluster working.
Anyway, switching independently provides a more flexible user policy.
The cost of this simplicity is that you can't run Linux on all the physical CPUs simultaneously. This means lower peak throughput and lower parallelism than is possible with big.LITTLE MP. But a similar level of powersaving should be achievable with both, because they both allow any physical CPU or cluster to be idled or turned off when the system is sufficiently idle.
For most use cases, the reduced throughput probably won't be an issue: the big cores have higher performance than the little cores anyway, so when the platform is running at full throttle, you get most of the theoretically possible system throughput even with the little cores turned off. Having more CPUs active also adds interprocessor/intercluster communication overheads, so you may still get a better power-performance tradeoff if the little cluster is simply turned off when you want to run the device as fast as possible.
I am curious to know which implementation Linaro will finally choose. :) Many thanks for all the support.
The main difference is that the switcher approach does not rely on experimental scheduler modifications, and the impact of the switcher on system behaviour and performance is better understood than for MP right now. Therefore it should be possible to have it working well sooner (and upstreamed sooner) than will be possible with b.L MP.
b.L MP is a more flexible and powerful approach, but this is expected to mature over a longer timescale. Linaro is interested in both.
Yep, the scheduler modification is a tough task. :)
Cheers ---Dave
Thanks, Lei