Hi Vincent,
Here we have RT activity, induced with rt-app, running on the big CPU cluster, with hackbench running in parallel. The RT tasks are bound to the 4 CPUs of the big cluster (CPUs 4, 5, 6 and 7) and have a 100ms period (runtime=20ms, sleep=80ms).
Hackbench shows a big improvement (up to 30%) when the number of tasks is 8 and 32.

Note: data is completion time in seconds (lower is better). The number of loops is 50000 for 8 and 16 tasks, and 20000 for 32 tasks.

+--------+-----+-------+-------------------+---------------------------+
| groups | fds | tasks |   Without Patch   |        With Patch         |
|        |     |       +---------+---------+-----------------+---------+
|        |     |       | Mean    | Stdev   | Mean            | Stdev   |
+--------+-----+-------+---------+---------+-----------------+---------+
|   1    |  8  |   8   | 1.0534  | 0.13722 | 0.7293 (+30.7%) | 0.02653 |
|   2    |  8  |  16   | 1.6219  | 0.16631 | 1.6391 (-1%)    | 0.24001 |
|   4    |  8  |  32   | 1.2538  | 0.13086 | 1.1080 (+11.6%) | 0.16201 |
+--------+-----+-------+---------+---------+-----------------+---------+
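For anyone who wants to recheck the percentages against the means, here is a minimal sketch in Python. It assumes the improvement is computed as (without - with) / without, which is not stated in the mail, and the function name is purely illustrative. The results should agree with the table to within rounding.

```python
def improvement_pct(without_patch, with_patch):
    """Percent reduction in completion time; positive means faster with the patch."""
    # Assumption: improvement is measured relative to the unpatched mean.
    return (without_patch - with_patch) / without_patch * 100.0


# Mean completion times in seconds, copied from the table, keyed by task count.
means = {
    8: (1.0534, 0.7293),
    16: (1.6219, 1.6391),
    32: (1.2538, 1.1080),
}

for tasks, (before, after) in means.items():
    print(f"{tasks:>2} tasks: {improvement_pct(before, after):+.1f}%")
```

Note that this only compares means. With the reported standard deviations, the 16-task difference of about 1% is well inside the run-to-run noise.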
Out of curiosity, do you know why you don't see any improvement for 16 tasks but only for 8 and 32 tasks?
Yes, I'm not fully sure why 16 tasks didn't show that much improvement.
Yes. This is just to make sure that there is no unexpected side effect.
Just got back from vacation. I tried to reproduce these results, but it looks like our product kernel has changed enough that I am not able to replicate them exactly, and I don't recall the tree I ran these on. I will redo these tests and share my data in the next rev. Worst case, I can probably drop this test, since there are other hackbench tests in this patch that show improvements as well. But I'll give it a shot to make sure there are no side effects from this. Thanks.
- Joel