Hello Amit/Robert/Others,
I was looking into the code for heat_cpu.c in pm-qa. I think there are some
issues with the current implementation. Please find details below -
1. #define NR_THREAD 3
pid_t thr_id[NR_THREAD];
I think the array thr_id should be dynamically allocated as number of
threads depend on number of cpus. However, we can eliminate the need for
this array as mentioned below at no 7.
2. There is no error checking for sysconf (...) call (line 83). If this
call fails num_cpus will be -1 which is a problem.
3. at line 122,
pthread_t *p_thread_ptr = (pthread_t *) malloc( num_cpus *
sizeof(pthread_attr_t));
we are allocating space for pthread_t. So sizeof(ptread_t) should be
used instead of sizeof(pthread_attr_t).
4. at line 144, SCHED_NORMAL is used. On my Ubuntu 13.04, <sched.h> does
not define SCHED_NORMAL. So it gives compile error if ANDROID is defined.
heat_cpu.c:145:44: error: ‘SCHED_NORMAL’ undeclared (first use in this
function)
We can either use SCHED_OTHER (which is equivalent to SCHED_NORMAL) or
include <linux/sched.h>
5. at line 168, error message is misleading. This error is related to
failure from pthread_create, not during setting affinity.
6. at line 76, the call to gettid results in linking error -
heat_cpu.c: In function ‘do_loop’:
heat_cpu.c:77:2: warning: implicit declaration of function ‘gettid’
[-Wimplicit-function-declaration]
/tmp/ccAhcUGE.o: In function `do_loop':
heat_cpu.c:(.text+0x4e): undefined reference to `gettid'
There is no libc wrapper for gettid. So I think it should be called as
below -
#include <sys/syscall.h>
thr_id[thread] = syscall(SYS_gettid);
With this changes the compilation warning and link error goes away and
it works. However, this call can be eliminated as suggested below at no 7.
7. There is a potential race condition between line 171 & 182, despite the
synchronization mechanism (mutex & condition variable) used. There is no
guaranty that do_loop(...) will execute first and thr_id[thread] =
gettid(); will execute before setting affinity as noted from below debug
output -
setting affinity for : 0
setting affinity for : 0
setting affinity for : 0
setting affinity for : 0
The code is setting affinity for tid 0, which it should not, despite the
synchronization code.
However, we can eliminate need for thr_id and the associated
synchronization if we use pthread_setaffinity_np(...) .
Would you care to test the alternative implementation using the patch
below (also attached)?
--------------------------------------------------------------------
diff --git a/utils/heat_cpu.c b/utils/heat_cpu.c
index 90d8a97..236cb1d 100644
--- a/utils/heat_cpu.c
+++ b/utils/heat_cpu.c
@@ -47,23 +47,12 @@
#include <string.h>
#include <math.h>
-#define NR_THREAD 3
-
int cont = 1;
-int cpu_index;
long long a = 1, c = 1;
float f = M_PI;
-pid_t thr_id[NR_THREAD];
int moderate_inst;
-
-#ifdef ANDROID
-int conditionMet;
-pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
-pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
-#endif
-
void *do_loop(void *data)
{
const long long b = 3;
@@ -71,14 +60,6 @@ void *do_loop(void *data)
a = 1;
c = 1;
-#ifdef ANDROID
- long int thread = (long int) data;
- thr_id[thread] = gettid();
- pthread_mutex_lock(&mutex);
- while (!conditionMet)
- pthread_cond_wait(&cond, &mutex);
- pthread_mutex_unlock(&mutex);
-#endif
while (cont) {
if (!moderate_inst) {
@@ -96,9 +77,15 @@ void *do_loop(void *data)
int main(int arg_count, char *argv[])
{
int ret, i;
- int num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
+ int num_cpus;
cpu_set_t cpuset;
+ num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
+ if (num_cpus < 0) {
+ printf("ERROR: sysconf error\n");
+ return -1;
+ }
+
printf("Num CPUs: %d\n", num_cpus);
if (arg_count > 1) {
if (!strcmp("moderate", argv[1])) {
@@ -120,7 +107,7 @@ int main(int arg_count, char *argv[])
}
pthread_t *p_thread_ptr = (pthread_t *) malloc(
- num_cpus * sizeof(pthread_attr_t));
+ num_cpus * sizeof(pthread_t));
if (!p_thread_ptr) {
printf("ERROR: out of memory\n");
@@ -141,7 +128,7 @@ int main(int arg_count, char *argv[])
#endif
#else
- ret = pthread_attr_setschedpolicy(&p[i], SCHED_NORMAL);
+ ret = pthread_attr_setschedpolicy(&p[i], SCHED_OTHER);
#endif
/* for each new object */
CPU_SET(i, &cpuset);
@@ -156,7 +143,7 @@ int main(int arg_count, char *argv[])
return ret;
}
#endif
- CPU_CLR(i, &cpuset);
+ CPU_CLR(i, &cpuset);
}
@@ -165,29 +152,23 @@ int main(int arg_count, char *argv[])
ret = pthread_create(&p_thread_ptr[i], &p[i],
do_loop, (void *)i);
if (ret < 0)
- printf("Error setting affinity for cpu%d\n", i);
+ printf("Error creating thread for cpu%d\n", i);
#ifdef ANDROID
CPU_ZERO(&cpuset);
CPU_SET(i, &cpuset);
- ret = sched_setaffinity(thr_id[i], sizeof(cpuset), &cpuset);
+ ret = pthread_setaffinity_np(p_thread_ptr[i],
sizeof(cpuset), &cpuset);
if (ret) {
- printf("Error setting affinity on pthread th_id\n");
+ printf("Error setting affinity on pthread\n");
printf("Error: %s\n", strerror(ret));
return ret;
}
#endif
}
-#ifdef ANDROID
- pthread_mutex_lock(&mutex);
- conditionMet = 1;
- pthread_cond_broadcast(&cond);
- pthread_mutex_unlock(&mutex);
-#endif
-
while (1)
sleep(1);
+
return 0;
}
--
Thanks,
-Meraj
Hello,
I tested powertop on Samsung Arndale Board with Ubuntu Saucy Server 14.01
with custom kernel (config file attached, powertop config options enabled),
and reported a bug 1316096. Also, I found the other bug 1316096 happening
on arndale board.
In short, bug 1316096 - Powertop displays "nan" on top [summary] where it
should show legal value.
Summary: nan wakeups/second, nan
GPU ops/seconds, nan VFS ops/sec and -0.0% CPU use <--- displays "nan"
instead of valid value
bug 1316096 - Idle stats & Frequency stats pages are empty.
I did some debugging and it seems that the root cause of both the bugs may
be same. It looks like the parsing code enumerate_cpus()[src/cpu/cpu.cpp]
called from main.cpp, is expecting /proc/cpuinfo file in a particular
format which works on Intel but not on ARM.
On Intel, it gets the file in the format expected. However, on ARM on
Arndale board with saucy server, the file looks as below -
processor : 0
model name : ARMv7 Processor rev 4 (v7l)
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc0f
CPU revision : 4
processor : 1
model name : ARMv7 Processor rev 4 (v7l)
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc0f
CPU revision : 4
Hardware : SAMSUNG EXYNOS5 (Flattened Device Tree)
Revision : 0000
Serial : 0000000000000000
The parsing code which gathers information about cpus, fails to parse this
file and as a result the vector "all_cpus" remains empty.
I played with it a bit more and added a few lines to get a work around as
follows -
in function enumarate_cpus():
if (strncmp(line, "processor\t",10) == 0) {
char *c;
c = strchr(line, ':');
if (c) {
c++;
number = strtoull(c, NULL, 10);
* handle_one_cpu(number, "Samsung", 0, 0);
[ line added]*
* set_max_cpu(number); [ line added]*
}
}
With this changes, idle stat and frequency pages were populated and summary
was shown properly as noted below -
*Summary: 392.6 wakeups/second, 0.0 GPU ops/seconds, 0.0 VFS ops/sec and
4.0% CPU use*
I think due to this bug, the information shown on summary page was also not
correct and some other info displayed by powertop may be incorrect too.
Please comment.
--
Thanks,
-Meraj
Hello,
It looks like README file in powertop branch is out of sync with how
powertop should be actually built.
It should be updated.
diff --git a/README b/README
index 2961eba..d83b315 100644
--- a/README
+++ b/README
@@ -2,9 +2,10 @@
------------------------------
To build and install PowerTOP type the following commands,
- ./configure
- ./make
- ./make install
+ autoreconf -sfi
+ ./configure
+ make
+ make install
Note: For Android (running Intel Architecture ) there is a Android.mk
that was provided by community members, and at this time is supported
--
Thanks,
-Meraj
Hi All,
In arch/arm64/include/asm/barrier.h, there is the definition of
smp_mb()/smp_rmb()/smp_wmb() for arm64. I noticed that all the 3 macors
are using "dmb ishxx", which is only affect the cluster of the CPU
executing the instruction. But in the big.LITTLE system, there will be 2
cluster. So the smp_mb()/smp_rmb()/smp_wmb() cannot affect all the CPU
in the system.
Is there other considerations so that smp_mb()/smp_rmb()/smp_wmb() are
implemented to "only affecting inner sharable cores"?
Best Regards,
Kelvin K. Li
---------------------------------------------------------------
Software Team, VIA Technologies, Inc.
5F, VIA Tower, 1 Zhongguancun East Road,
Haidian District, Beijing, 100084
Tel: 86-10-59852288 ext.3620
mailto: kelvinkli(a)via-alliance.com