On 29 July 2016 at 04:18, liubowen (A) <liubowen2@huawei.com> wrote:

Hi Mathieu:

 

    Glad to receive your reply.

I am sorry to trouble you again. Let me restate my question, and thank you again for your time.

As we know, perf records samples from interrupts raised by the PMU and uses them to show hot spots.


Not all PMUs generate interrupts, and that is exactly the case for CoreSight.  The CoreSight PMU simply starts trace collection when the process it is associated with is installed on a processor.  The recording process stops when the process is yanked out.  As such, the issues with spinlocks that you describe below aren't a problem.  The ETMs will trace for as long as the process is executing, regardless of what that execution is.

 

One condition as follows:

 

spin_lock_irq();

   A();

   B();

spin_unlock_irq();

 

To avoid deadlock, we use spin_lock_irq() instead of spin_lock(). However, spin_lock_irq() disables local interrupts through local_irq_disable().

So the interrupts raised by the PMU cannot be handled until spin_unlock_irq() executes. At that point, the current instruction pointer points into spin_unlock_irq().

The time spent in A() and B() is therefore attributed to spin_unlock_irq(). When we run perf report, we cannot see the share of time taken by A() or B(). The report is skewed; a correct report would show A() and B().
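
The skew described above can be reproduced with a toy model. The sketch below is not kernel code. It assumes a hypothetical timeline of (function, ticks, irqs_masked) entries and a sampling interrupt raised every `period` ticks, and it models the interrupt line as a single pending flag (an assumption of this sketch). An interrupt raised while IRQs are masked stays pending and is delivered at the first tick where they are enabled again, so its sample is charged to whatever runs at that moment.

```python
from collections import Counter


def sample_profile(timeline, period):
    """Attribute periodic sampling interrupts for a toy timeline.

    timeline: list of (name, ticks, irqs_masked) tuples (hypothetical model).
    period:   a sampling interrupt is raised every `period` ticks.
    """
    samples = Counter()
    pending = False  # single interrupt line: at most one interrupt pending
    tick = 0
    for name, ticks, irqs_masked in timeline:
        for _ in range(ticks):
            tick += 1
            if tick % period == 0:
                pending = True  # the PMU raises its interrupt
            if pending and not irqs_masked:
                # The handler runs only now; the sample is charged to the
                # function executing at delivery time.
                samples[name] += 1
                pending = False
    return samples


# Hypothetical timeline mirroring the spin_lock_irq() example.
timeline = [
    ("other", 10, False),
    ("spin_lock_irq", 1, True),
    ("A", 40, True),
    ("B", 40, True),
    ("spin_unlock_irq", 1, False),
    ("other", 8, False),
]
profile = sample_profile(timeline, period=10)
```

With this timeline the samples land only on "other" and "spin_unlock_irq". A and B receive none, although they account for 80 of the 100 ticks.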

So far I have not come up with a good solution. Today I read the paper “CoreSight, Perf and the OpenCSD Library” once again and understood more.

The trace data from the ETM is recorded in perf.data. When we run perf report or perf script, the trace data is decoded to obtain hot spots and so on.
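
As a rough illustration of how decoded trace could yield hot spots, the sketch below assumes the decoder output has already been reduced to a list of executed address ranges and that a sorted symbol table is available. Both the record format and the symbol table are hypothetical; this is not the OpenCSD or perf output format. It counts executed instructions per function, which is only a proxy for time spent.

```python
import bisect
from collections import Counter

INSN_SIZE = 4  # A64 instructions are 4 bytes wide


def hot_spots(ranges, symbols):
    """Count executed instructions per function.

    ranges:  iterable of (start, end) executed address ranges, end exclusive
             (hypothetical, already-decoded trace records).
    symbols: list of (start_address, name) sorted by address (hypothetical).
    """
    starts = [addr for addr, _ in symbols]
    counts = Counter()
    for start, end in ranges:
        for addr in range(start, end, INSN_SIZE):
            # Find the last symbol whose start address is <= addr.
            idx = bisect.bisect_right(starts, addr) - 1
            if idx >= 0:
                counts[symbols[idx][1]] += 1
    return counts


# Made-up addresses for the functions in the spin_lock_irq() example.
symbols = [
    (0x1000, "spin_lock_irq"),
    (0x1100, "A"),
    (0x1200, "B"),
    (0x1300, "spin_unlock_irq"),
]
ranges = [(0x1000, 0x1010), (0x1100, 0x1180), (0x1200, 0x1240), (0x1300, 0x1308)]
counts = hot_spots(ranges, symbols)
```

Because every executed range is recorded regardless of interrupt masking, A and B appear in the counts here, unlike in an interrupt-driven sample profile.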

I wonder whether the trace data covers the whole recording, from start to end. If the trace data is complete, the skewed report would be solved perfectly. I have not yet gained much insight into the solution you offer, so it is a little hard for me to check. I hope you can understand what I am asking and give me some suggestions. Maybe it is an easy question; I beg your pardon. ^_^


From your description it is not clear (at least to me) if you have collected trace data generated by the CoreSight PMU or not.  
 

 

Okay, thank you very much for the time spent on my question. It is my honor to talk with you.

   

On the other hand, I found something wrong on the web page https://github.com/Linaro/OpenCSD/blob/opencsd-0v002/HOWTO.md#on-target-trace-collection


I suggest you use opencsd-0v003 - it has the latest code and updated documentation.

 

I cloned the whole project with git, but there is no branch named perf-opencsd-4.7-rc1.

 

So when I follow the instructions, I cannot get the rc1 branch. Maybe that is why I am stuck.

 


Simply use branch "perf-opencsd-4.7" - it has the same features as "perf-opencsd-4.7-rc1".

Also keep an eye out for the address range filtering feature, allowing one to limit tracing to a very narrow range.  I will publish the code in the coming weeks, as soon as I know it has made it to the maintainers' tree. 

 

Finally, thank you for your time!

 

Regards

Bob


From: Mathieu Poirier [mailto:mathieu.poirier@linaro.org]
Sent: 28 July 2016 22:38
To: liubowen (A)
Cc: coresight@lists.linaro.org; Zhanweitao
Subject: Re: questions on coresight integrated with perf

 

 

 

On 28 July 2016 at 00:53, liubowen (A) <liubowen2@huawei.com> wrote:

Hi,

 

  Thanks for your time!

 

I am Bob, and I am interested in the CoreSight project. I learned a lot from the web page http://www.linaro.org/blog/core-dump/coresight-perf-and-the-opencsd-library/.

 

I work on ARM64, and perf has a known problem on ARM. Details are at https://www.linaro.org/blog/core-dump/debugging-arm-kernels-using-nmifiq/.

 

  For instance, when we run dd if=/dev/urandom of=/dev/null, over 90% of the CPU time is reported as spent unlocking interrupts, and the cryptographic operations that should dominate the use case are completely hidden.

 

The author, Daniel Thompson from Linaro, proposes a preliminary solution, but he notes that it will need further work.

 

Now, CoreSight can trace program flow purely in hardware. If we combine CoreSight with perf and run perf record on “dd if=/dev/urandom of=/dev/null”, will the report be normal?

  If it is normal, that will be amazing! I am eager for any related information.

 

 

What do you expect to see in a "normal" report?  

 

There is no restriction on the code CoreSight can trace, and with the soon-to-be released address filtering capabilities, knowing exactly what the HW is doing will become a lot easier.  The only requirement (for now) is that CPUidle be disabled.

 

 

  I have followed the documentation to enable CoreSight and perf, but I am stuck. I cannot figure out whether what I see is normal.

 

That is unfortunately the downside to CoreSight.  But as with every powerful technology, complexity is inherent.

 

 

  I greatly appreciate your help! Thanks again for your time!

 

 

I am not sure of how I can help you here.  Other than the one above (to which I have replied), I don't see any specific questions.

 

Regards,

Mathieu

 


_______________________________________________
CoreSight mailing list
CoreSight@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/coresight