These series includes fixes/changes that have been done. These patches
mainly fixed some typos or moves code but should not change the
behavior of rt-app.
Those patches are also found in the branch:
https://git.linaro.org/people/picheng.chen/rt-app.git fix
Chris Muller (1):
Update thread name
Ivan T. Ivanov (1):
fixup json-c dependencies
Vincent Guittot (12):
Minor clean up
fix deadline print format
consolidate trace and debug point
update .gitignore
fix inconsistency in delay unit
fix cpu affinity string info
deadline: set deadline field to deadline parameter
reorder the start sequence of threads
cleanup of doc directory
remove useless json_object_put
remove conditional compilation of json Remove useless --with-json
rt-app: remove use of deprecated json interface
.gitignore | 8 ++
README.in | 7 +-
autogen.sh | 2 +-
configure.ac | 17 ++--
doc/taskset.yml | 53 -----------
src/Makefile.am | 8 +-
src/rt-app.c | 227 ++++++++++++++++++++++++++--------------------
src/rt-app_args.c | 77 ++++++++--------
src/rt-app_parse_config.c | 51 +++++++----
src/rt-app_parse_config.h | 2 +-
src/rt-app_utils.c | 3 +
src/rt-app_utils.h | 4 +-
12 files changed, 228 insertions(+), 231 deletions(-)
delete mode 100644 doc/taskset.yml
--
1.9.1
Add a tutorial.txt file that describes the grammar used in json file to
emulated a use case
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
doc/tutorial.txt | 374 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 374 insertions(+)
create mode 100644 doc/tutorial.txt
diff --git a/doc/tutorial.txt b/doc/tutorial.txt
new file mode 100644
index 0000000..3b2fa03
--- /dev/null
+++ b/doc/tutorial.txt
@@ -0,0 +1,374 @@
+HOW-TO use rt-app and workgen to emulate a workload ?
+
+**** rt-app and workgen ****
+rt-app/workgen is a tool that can be used to emulate a use case. Not only the
+sleep and run pattern can be emulated but also the dependency between tasks
+like accessing same critical resources, creating sequencial wake up or syncing
+the wake up of threads. The use case is described in a json like file which is
+parsed by workgen then rt-app.
+
+workgen is a python script that will parse, check and update the json like file
+in order to strictly match the json grammar before calling rt-app that will
+execute the sequence described in it.
+
+Even if the json protocol is used in rt-app, workgen enables some freedom
+compared to json grammar:
+- You don't have to ensure uniqueness of key in json object, workgen will check
+the file and ensure that each key is unique by appending an index if necessary.
+The feature is quite useful when you describe a sequence made of same kind of
+events like a sequence of multiple sleep and run events.
+- Some values can be omitted and workgen will add them when it parses the file
+before starting the use case.
+
+**** json file skeleton ****
+
+The json file that describes a workload is made on 3 main objects: tasks,
+resources and global objects. Only task object is mandatory; default value will
+be used if global object is not defined and resources object is kept for
+backward compatibility with old version of rt-app but it doesn't give any
+benefit to declare it as the resources are now dynamiccally created by rt-app
+when it parses the file.
+
+**** global object ****
+
+The global object defines parameters for the whole use case:
+
+* duration : Integer. Duration of the use case in second. All the threads will
+be killed once the duration has elapsed. if -1 has been set, the use case will
+run indefinitly until all threads kill themselve (as an example if a finite
+number of loop has been defined in their running pattern) or if a signal
+is received to stop the use case.
+
+* calibration : Text or Integer: A text defines the CPU that will be used to
+calibrate the ns per loop value. "CPU0" is the default value (see run event
+section for details about ns per loop value). A integer skips the calibration
+step and uses the integer value as ns per loop.
+
+* default_policy : String. Default scheduling policy of threads. Default
+value is SCHED_OTHER.
+
+* pi_enabled: Boolean. Enable the prority inheritance of mutex. Default value
+is False.
+
+* lock_pages : Boolean. Lock the mem page in RAM. The lock of the page is
+only possible for non CFS tasks. Default value is False.
+
+* logdir : Text. Path to store the various log files. The default path is
+the current directory (./).
+
+* log_basename : Text. Prefix used for all log files of the use case.
+"rt-app-" is used by default.
+
+* ftrace: Boolean. If enable, rt-app logs in ftrace the main events of the use
+case. Default value is False.
+
+* gnuplot : Boolean. if True, it will create a gnu plot compatible file for
+each threads (see gnuplot section for more details). Default value is False.
+
+*** default global object:
+ "global" : {
+ "duration" : -1,
+ "calibration" : "CPU0",
+ "default_policy" : "SCHED_OTHER",
+ "pi_enabled" : false,
+ "lock_pages" : false,
+ "logdir" : "./",
+ "log_basename" : "rt-app",
+ "ftrace" : false,
+ "gnuplot" : false,
+ }
+
+**** tasks object ****
+
+ The tasks object is made of sub-objects that describe threads. No other kind
+ of object than thread object is allowed. The key value of each object will be
+ used as thread's name.
+
+"tasks" : {
+ "thread_name1" : {
+ ...
+ },
+ "thread_name2" : {
+ ...
+ },
+ "thread_name3" : {
+ ...
+ },
+ ...
+}
+
+*** thread object ***
+
+Thread object describes the behavior of one kind of thread which means that
+several threads can be created with the same object (see instance parameter
+below).
+
+* instance : Integer. Define the number of threads that will be created with
+the properties of this thread object. Default value is 1.
+
+*** policy : String: Define the scheduling policy of the thread. default_policy
+is used if not defined
+
+* priority : Integrer. Define the scheduling priority of the thread. The value
+must be aligned with the range allowed by the policy. The default priority is
+0 for sched_other class and 10 for other classes
+
+* runtime : Integer: Define the runtime budget for deadline scheduling class.
+Default value is 0.
+
+* period : Integer. Define the period duration for deadline scheduling class.
+Default value is runtime.
+
+* deadline : Integer. Define the deadline parameter for deadline scheduling
+class. Default value is period.
+
+* cpus: Array of Integer. Define the CPU affinity of the thread. Default
+value is all CPUs of the system. An example : "cpus" : [0, 2, 3]
+
+* phases: Object. The phases object described TBF
+
+If there is only 1 phase, the sequence of events can be directly put in the
+thread object instead of creating a phases object. As an example, the 2 objects
+below are equals:
+
+"task" : {
+ "thread1" : {
+ "phases" : {
+ "phase1" : {
+ "run" : 10,
+ "sleep" : 10
+ }
+ }
+ }
+}
+
+"task" : {
+ "thread1" : {
+ "run" : 10,
+ "sleep" : 10
+ }
+}
+
+* loop: Integer. Define the number of times the parent object must be run.
+The parent object can be a phase or a thread. For phase object, the loop
+defines the number of time the phase will be executed before starting the next
+phase. For thread object, the loop defines the number of times that the
+complete "phases" object will be executed. The default value is -1.
+
+*** events ***
+
+events are simple action that will be performed by the thread or on the
+thread. They have to be listed by execution order.
+
+* run : Integer. Emulate the execution of a load. The duration is defined in
+usec but the run event will effectively run a number of time a loop that waste
+cpu cycles. Whe the run event is executed, the duration is transformed in a
+number of loop thanks to the ns per loop value (see calibration). This way of
+working enables to emulate a load with a duration that will vary with the
+frequency or the compute capacity of the CPU.
+
+* sleep : Integer. Emulate the sleep of a task. The duration is defined in
+usec.
+
+* timer : Object. Emulate the wake up of the thread by a timer. Timer differs
+from sleep event by the start time of the timer duration. Sleep duration starts
+at the beg of the sleep event whereas timer duration starts at the end of the
+last use of the timer. So Timer event are immunized against preemption,
+frequency scaling and computing capacity of a CPU. The initial starting time of
+the timer is set during the 1st use of the latter.
+
+ |<------19---------->|<------19---------->|<------19---------->|
+task A ...|run 5|timer 19 |------run 5|timer 19|run 5|timer 19 |
+task B |run 10 |
+
+
+As an example to point out the difference betwwen sleep and timer, let consider
+a task A that run 5 and then sleep 10. The period of task A should be 15. Let
+add a task B that runs 5 and use a timer to wakes up every 19.
+
+ |<------15------>|<------15------>|<------19---------->|<------15------>|<------16------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|
+taskA ...|run 5|sleep 10 |run 5|sleep 10 |----run 5|sleep 10 |run 5|sleep 10 |-run 5|sleep 10 |----run 5|sleep 10 |----run 5|sleep 10 |----run 5|sleep 10 |----run 5|sleep 10 |----run 5|sleep 10 |
+taskB ...|------run 5|timer 19|--run 5|timer 19 |run 5|timer 19 |run 5|timer 19 |run 5|timer 19 |run 5|timer 19 |run 5|timer 19 |run 5|timer 19 |run 5|timer 19 |
+ |<------19---------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|<------19---------->|
+
+We can see that task B period stays to 19 even if the run event is delayed
+because of scheduling preemption whereas the period of task A starts at 15 but
+increases to 19 because of the scheduling delay.
+
+"unique" timer device: When a thread that use a timer is instanciated, the
+timer will be shared across the thread instance which disturbe the original
+sequence. In order to have 1 timer per instance, you can use the "unique"
+value so a new timer will be created for each instance.
+
+* lock : String. Lock the mutex defined by the string value.
+
+* unlock : String. Unlock the mutex defined by the string value.
+
+* wait : Object {"ref" : String, "mutex" : String }. Block the calling thread
+until another thread sends a wake up signal to the resource defined by "ref".
+The mutex defined by "mutex" is used during the block sequence: See
+pthread_cond_wait() for more details about the sequence (especially regarding
+the use of the mutex).
+
+* signal : String. Send a wake up signal to the resource defined by the
+string. The 1st thread in the wait list will be wokn up. See
+pthread_cond_signal() for more details.
+
+* broad : String. Send a wake up signal to the resource defined by the
+string. All threads that are blocked on the resource wil wake up. See
+pthread_cond_broadcast() for more details.
+
+* sync : Object {"ref" : String, "mutex" : String }. Atomically wakes
+up a blocked thread and then blocks the calling thread on the condition.
+taskA ...|run 5|wait|------------|run 5| wait |----------------
+taskB ...-------------------|sync|--------
+
+The sync event "sync" : {"ref" : "CondA", "mutex" : "mutexA" } generates the
+following sequence:
+{
+ "lock" : "mutexA",
+ "signal" : "CondA",
+ "wait" : { "ref" : "condA", "mutex" : "mutexA" },
+ "unlock" : "mutexA"
+}
+
+* suspend : String. Block the calling thread until another thread wakes it up
+with resume. The String can be let empty as it will be filled by workgen with
+the right thread's name before starting the use case.
+
+* resume : String. Wake up the thread defined by the string.
+
+taskA ...|run 5|suspend |----------------|run 5|suspend |----------------
+taskB ...-------------------|resume taskA|run 10 |--------------------
+
+
+**** Trace and Log ****
+
+Some traces and log hooks have been added to ease the debug and monitor various
+metrics of the use cases
+
+*** Trace ***
+
+A trace can be inserted into ftrace buffer for each event in order to sync
+kernel events like sched_switch with the use case's sequence.
+
+A simple example of ftrace log:
+
+ thread0-23345 [003] 115591.560884: tracing_mark_write: [0] starts
+ thread0-23345 [003] 115591.560890: tracing_mark_write: [0] begins loop 0 phase 0 step 0
+ thread0-23345 [003] 115591.560894: tracing_mark_write: [0] executing 0
+ thread0-23345 [003] 115591.580212: tracing_mark_write: [0] executing 1
+ thread0-23345 [003] 115591.580217: sched_switch: prev_comm=thread0 prev_pid=23345 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
+ <idle>-0 [003] 115591.670198: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=thread0 next_pid=23345 next_prio=120
+ thread0-23345 [003] 115591.670243: tracing_mark_write: [0] end loop 0 phase 0 step 0
+ thread0-23345 [003] 115591.670251: tracing_mark_write: [0] begins loop 0 phase 0 step 1
+ thread0-23345 [003] 115591.670254: tracing_mark_write: [0] executing 0
+ thread0-23345 [003] 115591.688081: tracing_mark_write: [0] executing 1
+ thread0-23345 [003] 115591.688085: sched_switch: prev_comm=thread0 prev_pid=23345 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
+ <idle>-0 [003] 115591.778063: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=thread0 next_pid=23345 next_prio=120
+ thread0-23345 [003] 115591.778108: tracing_mark_write: [0] end loop 0 phase 0 step 1
+ thread0-23345 [003] 115591.778116: tracing_mark_write: [0] begins loop 0 phase 0 step 2
+ thread0-23345 [003] 115591.778119: tracing_mark_write: [0] executing 0
+ thread0-23345 [003] 115591.794619: tracing_mark_write: [0] executing 1
+ thread0-23345 [003] 115591.794623: sched_switch: prev_comm=thread0 prev_pid=23345 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
+ <idle>-0 [003] 115591.884607: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=thread0 next_pid=23345 next_prio=120
+ thread0-23345 [003] 115591.884652: tracing_mark_write: [0] end loop 0 phase 0 step 2
+ ...
+ thread0-23345 [003] 115593.394022: tracing_mark_write: [0] begins loop 0 phase 0 step 17
+ thread0-23345 [003] 115593.394025: tracing_mark_write: [0] executing 0
+ thread0-23345 [003] 115593.410583: tracing_mark_write: [0] executing 1
+ thread0-23345 [003] 115593.410594: sched_switch: prev_comm=thread0 prev_pid=23345 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
+ <idle>-0 [003] 115593.500567: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=thread0 next_pid=23345 next_prio=120
+ thread0-23345 [003] 115593.500612: tracing_mark_write: [0] end loop 0 phase 0 step 17
+ thread0-23345 [003] 115593.500620: tracing_mark_write: [0] begins loop 0 phase 0 step 18
+ thread0-23345 [003] 115593.500623: tracing_mark_write: [0] executing 0
+ thread0-23345 [003] 115593.520198: tracing_mark_write: [0] executing 1
+ thread0-23345 [003] 115593.520202: sched_switch: prev_comm=thread0 prev_pid=23345 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
+ <idle>-0 [003] 115593.610185: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=thread0 next_pid=23345 next_prio=120
+ thread0-23345 [003] 115593.610230: tracing_mark_write: [0] end loop 0 phase 0 step 18
+ thread0-23345 [003] 115593.610258: tracing_mark_write: [0] exiting
+
+The main events are :
+
+* "tracing_mark_write: [0] starts" indicates the start of the thread with index 0
+
+* "tracing_mark_write: [0] begins loop A phase B loop C" indicates that the
+thread with index 0 starts to execute the loop C of the phase B of the main
+loop A
+
+* "tracing_mark_write: [0] executing X": indicates that the thread with index 0
+execute the event X of the current phase
+-"tracing_mark_write: [0] end loop A phase B step C" indicates that the
+thread with index 0 ends to execute the event C of the phase B of the loop A
+
+*"tracing_mark_write: [0] exiting": indiscates that the trhead with index 0 has
+finished to execute its events.
+
+*** Log and gnuplot ***
+
+You can log some metrics for each phase that is executed by a thread. These
+metrics are:
+- perf: number of loop that has been executed by run events
+- run: time spent by the thread to execute the run events
+- period: duration to execute the complte phase
+- start/end : absolute start and end time of a phase. Same time base is used in
+ftrace
+- rel_st: start time of a phase relatively to the beg of the use case
+
+Below is an extract of a log:
+
+# Policy : SCHED_OTHER priority : 0
+#idx perf run period start end rel_st
+0 500000 34382 114492 175604833899 175604948391 785
+0 500000 29786 109863 175604948437 175605058300 115323
+0 500000 26977 107077 175605058336 175605165413 225223
+0 500000 28667 108771 175605165458 175605274229 332344
+0 500000 26644 106745 175605274274 175605381019 441160
+0 500000 26745 106867 175605381061 175605487928 547947
+0 500000 29513 109620 175605487974 175605597595 654860
+
+Some gnuplot files are also created to generate charts based on the log files
+for each thread and for each kind of metrics. The format of the chart that
+will be generated by gnuplot is eps (text art has been used for the chart
+below)
+
+ Measured thread0 Loop stats
+ Load [nb loop]
+ 120000 ++--+----+---+----+---+---+----+---+----+--++ 506000 load ******
+ + + + + + + + + + + + run ######
+ |$$ : : : : $$$ : : : : | period $$$$$$
+ 110000 ++.$$$........$$$$...$...$..........$$$.$$$$+
+ | : $$$$$$$$ $$ $: $ $$$$$$$$ $: $
+ | : : : : $ : :$ : : : ++ 504000
+ | : : : : : : : : : |
+ 100000 ++.........................................++
+ | : : : : : : : : : |
+ | : : : : : : : : : |
+ 90000 ++.........................................++
+ | : : : : : : : : : ++ 502000
+ | : : : : : : : : : |
+ 80000 ++.........................................++
+ | : : : : : : : : : |
+ | : : : : : : : : : |
+ 70000 +******************************************** 500000
+ | : : : : : : : : : |
+ | : : : : : : : : : |
+ | : : : : : : : : : |
+ 60000 ++.........................................++
+ | : : : : : : : : : |
+ | : : : : : : : : : ++ 498000
+ 50000 ++.........................................++
+ | : : : : : : : : : |
+ | : : : : : : : : : |
+ 40000 ++.........................................++
+ | : : : : : : : : : ++ 496000
+ |# : : : : : : : : : |
+ | # : : : : ### : : : : |
+ 30000 ++.###.###....####...#...#..........###.####+
+ | : # :#### ### : # ######## #: #
+ + + + + + + +# + + + +
+ 20000 ++--+----+---+----+---+---+----+---+----+--++ 494000
+ 17560556057560556057560656065606756065606756075607
+ Loop start time [msec]
+
+
--
1.9.1
Use pthread_cond_broadcast instead of pthread_cond_signal in order to wake
up all instances of a thread
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
src/rt-app.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/rt-app.c b/src/rt-app.c
index d2945e1..4376313 100644
--- a/src/rt-app.c
+++ b/src/rt-app.c
@@ -198,7 +198,7 @@ static int run_event(event_data_t *event, int dry_run,
{
log_debug("resume %s ", rdata->name);
pthread_mutex_lock(&(ddata->res.mtx.obj));
- pthread_cond_signal(&(rdata->res.cond.obj));
+ pthread_cond_broadcast(&(rdata->res.cond.obj));
pthread_mutex_unlock(&(ddata->res.mtx.obj));
break;
}
--
1.9.1
Dear all,
I'm using idlestat on a quad-core machine. There is only one cluster. I didn't implement cpu-map node in cpus node of device-tree file. My cluster id is read as -1.
It incurred idlestat assert failed. Need we judge this condition in idlestat? Or must we write cpu-map node in our dts file?
Thank you.
--
Zhizhou
Hi everyone,
For tuning schedule on big-LITTLE architecture, the workload replay helps. I have tried to use perf sched record/replay. But it seems not so exactly.
I have heard linaro group is going to implement the workload replay with rt-app project. Could anybody tell me what's the schedule about this feature?
Thank you.
--
Zhizhou
Tuukka,
While testing out the various outstanding patches, I found that the
comparison mode was broken.
Specifically, the following command failed with an assertion in merged_pstates()
./idlestat --import -f changedstate -b mytrace -r comparison -p
Data files can be obtained from pastebin:
mytrace = http://paste.ubuntu.com/10582474/
changedstate = http://paste.ubuntu.com/10582483/
During a quick investigation, it seemed like merged_pstates() isn't
doing what it says in the comments. The following patch seems to fix
it and make comparison mode work again. Can you confirm this is the
intended behaviour?
Minimal testing done.
Regards,
Amit