On Tue, Dec 18, 2012 at 5:53 PM, Vincent Guittot <vincent.guittot@linaro.org> wrote:
On 17 December 2012 16:24, Alex Shi <alex.shi@intel.com> wrote:
>> The scheme below tries to summaries the idea:
>>
>> Socket      | socket 0 | socket 1   | socket 2   | socket 3   |
>> LCPU        | 0 | 1-15 | 16 | 17-31 | 32 | 33-47 | 48 | 49-63 |
>> buddy conf0 | 0 | 0    | 1  | 16    | 2  | 32    | 3  | 48    |
>> buddy conf1 | 0 | 0    | 0  | 16    | 16 | 32    | 32 | 48    |
>> buddy conf2 | 0 | 0    | 16 | 16    | 32 | 32    | 48 | 48    |
>>
>> But, I don't know how this can interact with NUMA load balance and the
>> better might be to use conf3.
>
> I mean conf2 not conf3
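
For illustration only (this is not code from the patch): the three buddy layouts in the table above can be written as one small mapping function. The sketch assumes the topology shown in the table, 4 sockets of 16 contiguously numbered LCPUs; the function name `buddy` and the constants are hypothetical.

```python
CPUS_PER_SOCKET = 16  # assumption: matches the 64-LCPU example in the table
NR_SOCKETS = 4


def buddy(cpu: int, conf: int) -> int:
    """Return the buddy LCPU of `cpu` for layout conf0, conf1 or conf2."""
    if conf not in (0, 1, 2):
        raise ValueError("conf must be 0, 1 or 2")
    if not 0 <= cpu < CPUS_PER_SOCKET * NR_SOCKETS:
        raise ValueError("cpu out of range")

    socket, index = divmod(cpu, CPUS_PER_SOCKET)
    first = socket * CPUS_PER_SOCKET  # first LCPU of this socket

    # In every layout, the non-first LCPUs pack onto the first LCPU
    # of their own socket; conf2 does the same for the first LCPU too.
    if index != 0 or conf == 2:
        return first
    if conf == 0:
        # First LCPU of socket s packs onto LCPU s inside socket 0.
        return socket
    # conf1: first LCPU of socket s packs onto the first LCPU of
    # socket s-1 (socket 0 keeps itself).
    return max(socket - 1, 0) * CPUS_PER_SOCKET


if __name__ == "__main__":
    for conf in (0, 1, 2):
        row = [buddy(cpu, conf) for cpu in (0, 1, 16, 17, 32, 33, 48, 49)]
        print(f"buddy conf{conf}: {row}")
```

Written this way, the difference between the layouts is confined to where the first LCPU of each socket points: into socket 0 (conf0), to the previous socket (conf1), or to itself (conf2).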
Cyclictest is the ultimate small-task use case, which points out all the weaknesses of a scheduler for this kind of task. Music playback is a more realistic one, and it also shows improvement
granularity or one tick, thus we really don't need to consider task migration cost. But when the tasks are not too small, migration is more
For which kind of machine are you stating that hypothesis?
It seems the biggest disagreement between us is that you don't want to admit that 'not too small' tasks exist, and that they will cause more migrations because of your patch.
even so, they should run in the same socket for power-saving reasons (my power scheduling patch can do this), instead of being spread across all sockets.
This may be good for your scenario and your machine :-) Packing small tasks is the best choice for any scenario and machine.
That's clearly wrong. I have explained many times that your single buddy CPU cannot possibly pack all the tasks on a big machine, even one with just 16 LCPUs, although it is supposed to.
Anyway, you have the right to insist on your design, and I don't think I can state the scalability issue any more clearly. I won't judge the patch again.