Hi Yabin,
On Fri, Sep 20, 2019 at 12:06:26PM -0700, Yabin Cui wrote:
Thanks for the suggestions!
I think using the ETR's SG mode or CATU mode, rather than flat mode,
might reduce the overhead on the CPU and on memory bandwidth.
I am using the ETR's flat mode; I am not sure whether the qcom sdm845 supports SG or CATU. If they are better for performance, I can try them.
Let me first try your tests and share the profiling results from my Hikey board.
Another thing you could try is to use the 'perf -e cycles' command to
locate the hotspots in the CoreSight tracing flow.
Yes, I will profile the kernel code. One interesting thing is that tmc_flush_and_stop() always shows timeout warnings on the hikey board, but not on pixel 3. I don't know whether it performs some time-consuming hardware operation.
Could you confirm which Hikey board you are using? Hikey620 or Hikey960?
One thing to note is that on Hikey620 it is usually necessary to disable CPUIdle; otherwise the CoreSight driver might report warnings and dump a stack trace. If possible, could you add "nohlt" to the kernel command line and check whether the timeout warnings go away?
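After rebooting, a quick way to confirm that the option is actually in effect is to look for it in /proc/cmdline. The snippet below is only a sketch: the helper name has_boot_param is made up for illustration, and the optional second argument exists only so the check can be tried out on a sample string.

```shell
# has_boot_param PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline; passing it explicitly
# is only meant for trying the helper on a sample string.
has_boot_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    # Pad with spaces so that only whole-word matches count.
    case " $cmdline " in
        *" $param "*) return 0 ;;
    esac
    return 1
}
```

For example, `has_boot_param nohlt && echo "nohlt is set"` prints the message only when the running kernel was booted with that option.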
I suppose that depends on the mode of operation, i.e. per-thread or
CPU-wide. It is hard for me to comment without more information about how you came up with those metrics.
I am using one aux buffer per cpu. Different threads on one cpu share the same buffer using ioctl(PERF_EVENT_IOC_SET_OUTPUT). I measure the metrics as below:
$ simpleperf record -e cs-etm simpleperf stat -e cpu-cycles etm_test_one_thread
simpleperf is an alternative to linux perf that we use on android. You can replace it with linux perf with a small change to the options. cpu-cycles can be replaced by cpu-cycles:k or cpu-cycles:u to measure kernel space or user space only. etm_test_one_thread can be replaced by etm_test_multi_threads.
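The combinations described above can be listed with a small loop. This is only a sketch: the helper name etm_overhead_cmd is hypothetical, it assumes simpleperf and the two test programs are reachable as written in the command above, and it only prints the command lines so they can be checked before anything is run.

```shell
# etm_overhead_cmd EVENT TEST_PROGRAM
# Prints the measurement command: the test program runs under
# "simpleperf stat", which in turn runs under "simpleperf record -e cs-etm".
etm_overhead_cmd() {
    printf 'simpleperf record -e cs-etm simpleperf stat -e %s %s\n' "$1" "$2"
}

# List total, kernel-only and user-only cycle counts for both test programs.
for prog in etm_test_one_thread etm_test_multi_threads; do
    for event in cpu-cycles cpu-cycles:k cpu-cycles:u; do
        etm_overhead_cmd "$event" "$prog"
    done
done
```

This prints six command lines, one per event and test program pair.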
I included the test programs in the attachment, in case anyone wants to try them in their own environment. They need to be compiled with -O0.
Thanks for sharing the tests. I will try them on my Hikey620 board.
Thanks,
Leo Yan