On Fri, Apr 9, 2021 at 12:43 AM Pratik Sampat <psampat@linux.ibm.com> wrote:
On 09/04/21 10:53 am, Doug Smythies wrote:
I tried V3 on an Intel i5-10600K processor with 6 cores and 12 CPUs.
The core to cpu mappings are:
core 0 has cpus 0 and 6
core 1 has cpus 1 and 7
core 2 has cpus 2 and 8
core 3 has cpus 3 and 9
core 4 has cpus 4 and 10
core 5 has cpus 5 and 11
By default, it will test CPUs 0,2,4,6,8,10 on cores 0,2,4,0,2,4. Wouldn't it make more sense to test each core once?
Ideally it would be better to run on all the CPUs; however, on the larger systems I'm testing on, with hundreds of cores and a high thread count, the execution time increases without bringing any additional information to the table.
That is why it made sense to run on only one of the threads of each core, to make the experiment faster while preserving accuracy.
To handle various thread topologies, it may be worthwhile to parse /sys/devices/system/cpu/cpuX/topology/thread_siblings_list for each CPU and use this information to run only once per physical core, rather than assuming the topology.
What are your thoughts on a mechanism like this?
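
As a rough illustration of that idea (a sketch only, not the selftest's actual code), the sibling lists could be reduced to one CPU per core roughly like this. It assumes each thread_siblings_list file holds a comma-separated CPU list that may contain ranges (e.g. "0,6" or "0-1"); the function names are hypothetical:

```python
import glob
import os
import re


def parse_cpu_list(text):
    """Expand a CPU list string such as "0,6" or "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def one_cpu_per_core(siblings_by_cpu):
    """Given {cpu: sibling-list string}, return the lowest-numbered CPU of each core."""
    chosen = set()
    for sibling_text in siblings_by_cpu.values():
        siblings = parse_cpu_list(sibling_text)
        if siblings:
            # Every sibling of a core reports the same list, so the first
            # entry identifies the core uniquely.
            chosen.add(siblings[0])
    return sorted(chosen)


def read_siblings(sysfs_root="/sys/devices/system/cpu"):
    """Read thread_siblings_list for every CPU that exposes a topology directory."""
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "topology", "thread_siblings_list")
    result = {}
    for path in glob.glob(pattern):
        match = re.search(r"cpu(\d+)/topology", path)
        with open(path) as f:
            result[int(match.group(1))] = f.read()
    return result
```

On a live system, one_cpu_per_core(read_siblings()) would give the CPUs to test. Picking the lowest-numbered sibling is an arbitrary choice; any one thread per core would serve the same purpose.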
Yes, seems like a good solution.
... Doug