Hi Zied,
On 11/9/20 5:20 PM, Mathieu Poirier wrote:
On Sat, Nov 07, 2020 at 04:41:07AM +0000, zied guermazi wrote:
Hi Mathieu,
Is this status still up to date? Have there been any changes in between?
FYI: in gdb it is possible to trace multi-threaded programs on Intel processors using Intel PT. I am not quite sure how this is realized, but I tested it and it works.
Suzuki has done some work on this. Based on his findings doing something such as:
$ perf record -e cs_etm/.../ some_multithread_application
should work. Notice the absence of the --per-thread, -A or -C options. Kernel shark traces on a multi-threaded application yielded positive results. Inspection of the collected traces also showed trace data from different threads.
Yes, that should work. However, please be aware that trace decoding could be tricky on current platforms due to the shared sink, i.e., the traces from the threads could be mixed up in a single AUX record. It is possible to separate them by combining the TraceID (of the ETM) and the "CID" in the packets.
E.g., you could have:
AUX record = 0
  TraceID: 0 .. Context Packet.... CID=pidA   # Thread A on TraceID 0 (etm0)
  TraceID: 0 ... Branch packets..             # Thread A
  TraceID: 1 .. Context Packet.... CID=pidB   # Thread B on TraceID 1 (etm1)
  TraceID: 0 ... Branch packets..             # Thread A
  TraceID: 1 ... Branch packets..             # Thread B
  TraceID: 0 .. Context Packet... CID=pidB    # Now Thread B on TraceID 0 (etm0)
  TraceID: 0 ... Branch Packets...            # Thread B
Ideally the cs_etm decoder should try to do this split, and maybe create new AUX records by separating the trace stream (by correlating TraceID and CID) during a perf inject.
Or even teach the perf tool to look in all AUX records, ignoring the TID in the AUX_RECORD header.
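To make the TraceID/CID correlation concrete, here is a minimal sketch in C. It assumes a heavily simplified, hypothetical packet model: struct pkt, enum pkt_kind, assign_owners() and count_branches_for() are made up for illustration and are not part of perf or OpenCSD. The idea is only that a context packet sets the "current" CID for its TraceID, and every later packet on that TraceID belongs to that CID until the next context packet arrives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACE_IDS 128          /* illustrative bound: trace IDs fit in 7 bits */
#define NO_CID        UINT32_MAX   /* no context packet seen yet on this TraceID */

/* Hypothetical, simplified packet kinds (real ETM streams have many more). */
enum pkt_kind { PKT_CONTEXT, PKT_BRANCH };

/* Hypothetical decoded packet: which ETM emitted it, and what it carries. */
struct pkt {
    uint8_t       trace_id;  /* TraceID of the emitting ETM */
    enum pkt_kind kind;
    uint32_t      cid;       /* only meaningful for PKT_CONTEXT */
};

/*
 * For each packet in a mixed stream, write the CID (thread) that owns it
 * into owner[i]. Packets seen before any context packet on their TraceID,
 * or with an out-of-range TraceID, are marked NO_CID.
 */
void assign_owners(const struct pkt *stream, size_t n, uint32_t *owner)
{
    uint32_t current_cid[MAX_TRACE_IDS];
    size_t i;

    assert(n == 0 || (stream != NULL && owner != NULL));

    for (i = 0; i < MAX_TRACE_IDS; i++)
        current_cid[i] = NO_CID;

    for (i = 0; i < n; i++) {
        uint8_t id = stream[i].trace_id;

        if (id >= MAX_TRACE_IDS) {
            owner[i] = NO_CID;
            continue;
        }
        /* A context packet switches the thread running on this TraceID. */
        if (stream[i].kind == PKT_CONTEXT)
            current_cid[id] = stream[i].cid;

        owner[i] = current_cid[id];
    }
}

/* Count the branch packets in the mixed stream that belong to thread `pid`. */
size_t count_branches_for(const struct pkt *stream, size_t n, uint32_t pid)
{
    uint32_t current_cid[MAX_TRACE_IDS];
    size_t i, count = 0;

    for (i = 0; i < MAX_TRACE_IDS; i++)
        current_cid[i] = NO_CID;

    for (i = 0; i < n; i++) {
        uint8_t id = stream[i].trace_id;

        if (id >= MAX_TRACE_IDS)
            continue;
        if (stream[i].kind == PKT_CONTEXT)
            current_cid[id] = stream[i].cid;
        else if (current_cid[id] == pid)
            count++;
    }
    return count;
}
```

Fed the seven packets from the example above, this attributes the first two branch packets on TraceID 0 to thread A, and the branch packet on TraceID 1 plus the last one on TraceID 0 to thread B. A real implementation would also have to handle trace that starts mid-stream and context packets split across AUX records, which this sketch ignores.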
Suzuki
Have a go at it and let us know how you fare.
Thanks, Mathieu
I am attaching the gdb test source file for it. It can be compiled using:
  gcc -g ./non-stop.c -o ./non-stop -lpthread
Start gdb using:
  gdb ./non-stop
In gdb, execute the following commands:
  b main
  run
  break 27
  break 30
  thread apply all continue &
  record btrace
  thread apply all continue &
  thread apply all info rec
  info threads
  set record instruction-history-size 256
  thread apply 1 record goto 2
  thread apply 2 record goto 4
  thread apply 1 record instruction-history 1
  thread apply 2 record instruction-history 1
You can see that gdb collects traces for both threads.
Kind Regards,
Zied Guermazi
On Monday, March 2, 2020, 4:23:10 PM GMT+1, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
Hi Andrea,
On Mon, 2 Mar 2020 at 01:46, Andrea Brunato <Andrea.Brunato@arm.com> wrote:
Good morning,
Is tracing a multi-threaded program a supported use case for perf cs-etm?
It is not a supported case for both PT and Coresight.
Mathieu
If yes, are there any flags that should be specified with perf?
Thanks, Andrea
_______________________________________________
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
/* This testcase is part of GDB, the GNU debugger.

   Copyright 2015-2019 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#include <pthread.h>

static int global;

static void *
test (void *arg)
{
  unsigned int i;

  i = 0; /* bp.1 */
  for (; i < 10; ++i) global += i; /* loop */

  return arg; /* bp.2 */
}

int
main (void)
{
  pthread_t th;

  pthread_create (&th, NULL, test, NULL);

  test (NULL);

  pthread_join (th, NULL);

  return 0; /* bp.3 */
}