Hi guys,
I have been working on CPU isolation for some time now. The goal is to isolate a core (for high-performance networking: a data plane thread) from all kernel activity. A single data plane thread must run on the isolated CPU indefinitely.
So we need isolation from tasks, timers, ticks, workqueues, etc. Have I missed anything in this list?
I am currently doing this with the help of cpusets, NO_HZ_FULL, etc.
One problem that hasn't been solved very well so far is: "How can we guarantee that a CPU is isolated?"
Currently, my script relies on the fact that the per-CPU tick gets updated on most interruptions, so checking the clkevt device's count in /proc/interrupts should be enough.
Is that enough? Or are there cases where an interruption occurs and the tick count doesn't get updated?
The problem with the /proc/interrupts solution is that it depends on the clkevt driver. Platforms may use strings such as "arch_timer", "twd", or something else.
Is there a robust way that would work on any platform (ARM, x86, etc.)?
-- viresh
On 06/24/2014 01:01 PM, Viresh Kumar wrote:
> Hi guys,
> I have been working on CPU isolation for some time now. The goal is to isolate a core (for high-performance networking: a data plane thread) from all kernel activity. A single data plane thread must run on the isolated CPU indefinitely.
> So we need isolation from tasks, timers, ticks, workqueues, etc. Have I missed anything in this list?
> I am currently doing this with the help of cpusets, NO_HZ_FULL, etc.
> One problem that hasn't been solved very well so far is: "How can we guarantee that a CPU is isolated?"
No traces for the CPU over a long period of time (except for entering idle)?
> Currently, my script relies on the fact that the per-CPU tick gets updated on most interruptions, so checking the clkevt device's count in /proc/interrupts should be enough.
> Is that enough? Or are there cases where an interruption occurs and the tick count doesn't get updated?
> The problem with the /proc/interrupts solution is that it depends on the clkevt driver. Platforms may use strings such as "arch_timer", "twd", or something else.
> Is there a robust way that would work on any platform (ARM, x86, etc.)?
Yes, ftrace should give you this information by showing no traces for that CPU.
On 24 June 2014 17:21, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> No traces for the CPU over a long period of time (except for entering idle)?
Two things:
- I want to process data at runtime and start a new sample as soon as isolation gets interrupted.
  Traces are good for analysis, though, and I am using them to check what is interrupting the isolated core.
- Second, there are cases like the ONESHOT_STOPPED bug I reported to you. Those spurious interrupts don't get traced at all.
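The runtime sampling described in the first point could look roughly like the sketch below. It is only an illustration under stated assumptions, not the script mentioned in this thread: `longest_quiet_period` and its parameters are hypothetical names, `read_count` stands for any callable returning the isolated CPU's current interrupt count, and the polling loop is assumed to run on a non-isolated CPU.

```python
import time


def longest_quiet_period(read_count, duration, poll_interval=0.01,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll read_count() for `duration` seconds and return the longest
    stretch (in clock units) during which the count did not change.

    read_count is a hypothetical callable returning the interrupt count of
    the CPU under test. clock and sleep are injectable for testing.
    """
    start = clock()
    sample_start = start      # beginning of the current undisturbed sample
    last = read_count()
    longest = 0.0
    now = start
    while now - start < duration:
        sleep(poll_interval)
        now = clock()
        current = read_count()
        if current != last:
            # Isolation was broken: close this sample and start a new one.
            longest = max(longest, now - sample_start)
            sample_start = now
            last = current
    # Account for the sample that was still open when time ran out.
    return max(longest, now - sample_start)
```

The resolution is limited by `poll_interval`: an interruption is only noticed at the next poll, so the reported period can be off by up to one interval.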
On 06/24/2014 02:41 PM, Viresh Kumar wrote:
> On 24 June 2014 17:21, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>> No traces for the CPU over a long period of time (except for entering idle)?
> Two things:
> - I want to process data at runtime and start a new sample as soon as isolation gets interrupted.
>   Traces are good for analysis, though, and I am using them to check what is interrupting the isolated core.
> - Second, there are cases like the ONESHOT_STOPPED bug I reported to you. Those spurious interrupts don't get traced at all.
Hmm, I saw your patch, but I thought it was meant to spot the issue. Even if you enable all the traces, these wakeups are not shown?
On 24 June 2014 19:15, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> Hmm, I saw your patch, but I thought it was meant to spot the issue. Even if you enable all the traces, these wakeups are not shown?
Yeah, these can only be caught by IRQ traces, and those are probably handled later somewhere.
On 24 June 2014 13:01, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Hi guys,
> I have been working on CPU isolation for some time now. The goal is to isolate a core (for high-performance networking: a data plane thread) from all kernel activity. A single data plane thread must run on the isolated CPU indefinitely.
> So we need isolation from tasks, timers, ticks, workqueues, etc. Have I missed anything in this list?
IRQs, even if they are pinned to CPU0 by default on ARM systems.
> I am currently doing this with the help of cpusets, NO_HZ_FULL, etc.
> One problem that hasn't been solved very well so far is: "How can we guarantee that a CPU is isolated?"
> Currently, my script relies on the fact that the per-CPU tick gets updated on most interruptions, so checking the clkevt device's count in /proc/interrupts should be enough.
Why only clkevt devices, and not all the IRQs except those that you specifically want handled on the isolated CPU (if there are any)?
> Is that enough? Or are there cases where an interruption occurs and the tick count doesn't get updated?
Yes, a short wakeup caused by an IRQ doesn't always generate a tick IRQ.
> The problem with the /proc/interrupts solution is that it depends on the clkevt driver. Platforms may use strings such as "arch_timer", "twd", or something else.
> Is there a robust way that would work on any platform (ARM, x86, etc.)?
> -- viresh
On 24 June 2014 17:48, Vincent Guittot <vincent.guittot@linaro.org> wrote:
> IRQs, even if they are pinned to CPU0 by default on ARM systems.
Forgot to add them; yes, these are affined to a non-isolated core on my setup.
> Why only clkevt devices, and not all the IRQs except those that you specifically want handled on the isolated CPU (if there are any)?
Hmm, see below.
>> Is that enough? Or are there cases where an interruption occurs and the tick count doesn't get updated?
> Yes, a short wakeup caused by an IRQ doesn't always generate a tick IRQ.
Probably yes. We do reach tick_nohz_irq_exit() on IRQ exit, but tick_nohz_stop_sched_tick() doesn't always call tick_do_update_jiffies64().
Checking all IRQ sources is probably a good idea, and it would remove the platform dependency as well.
But is there anything else that might be missed here?
Thanks for the input, Daniel and Vincent.
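As a rough illustration of checking all IRQ sources rather than one clkevt string, the sketch below sums every per-CPU row of /proc/interrupts (device IRQs and IPIs alike) and compares two snapshots. It is not the script referred to in this thread; the function names `per_cpu_totals`, `read_totals`, and `was_interrupted` are made up for the example, and the parser assumes the usual layout of a `CPUn` header row followed by `label: count count ... description` rows.

```python
def per_cpu_totals(text):
    """Return {cpu_index: total} summed over every per-CPU row in the
    contents of /proc/interrupts, regardless of the driver name."""
    lines = text.splitlines()
    # The header row names the online CPUs, e.g. "CPU0 CPU1".
    cpus = [int(name[3:]) for name in lines[0].split()]
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = rest.split()[:len(cpus)]
        # Rows with a single global count (e.g. "Err:") are skipped. On a
        # single-CPU system they cannot be told apart and would be counted.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


def read_totals(path="/proc/interrupts"):
    """Take one snapshot of the per-CPU totals from the running system."""
    with open(path) as f:
        return per_cpu_totals(f.read())


def was_interrupted(before, after, cpu):
    """True if any interrupt count for `cpu` changed between snapshots."""
    return after[cpu] != before[cpu]
```

Taking `read_totals()` before and after a test window and calling `was_interrupted(before, after, cpu)` for the isolated CPU gives a check that does not depend on platform-specific names such as "arch_timer" or "twd". Interrupts that are expected on the isolated CPU would need to be excluded by label, which this sketch does not do.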
Viresh Kumar <viresh.kumar@linaro.org> writes:
> On 24 June 2014 17:48, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>> IRQs, even if they are pinned to CPU0 by default on ARM systems.
> Forgot to add them; yes, these are affined to a non-isolated core on my setup.
>> Why only clkevt devices, and not all the IRQs except those that you specifically want handled on the isolated CPU (if there are any)?
> Hmm, see below.
>>> Is that enough? Or are there cases where an interruption occurs and the tick count doesn't get updated?
>> Yes, a short wakeup caused by an IRQ doesn't always generate a tick IRQ.
> Probably yes. We do reach tick_nohz_irq_exit() on IRQ exit, but tick_nohz_stop_sched_tick() doesn't always call tick_do_update_jiffies64().
> Checking all IRQ sources is probably a good idea, and it would remove the platform dependency as well.
> But is there anything else that might be missed here?
You probably also want to watch the IPIs listed in /proc/interrupts.
Kevin
On 24 June 2014 21:40, Kevin Hilman <khilman@linaro.org> wrote:
> You probably also want to watch the IPIs listed in /proc/interrupts.
Yeah, I will be watching every interrupt listed in /proc/interrupts.
Hi Viresh,
On 06/24/2014 10:23 PM, Viresh Kumar wrote:
> On 24 June 2014 21:40, Kevin Hilman <khilman@linaro.org> wrote:
>> You probably also want to watch the IPIs listed in /proc/interrupts.
> Yeah, I will be watching every interrupt listed in /proc/interrupts.
I gave it some thought; it looks like /proc/interrupts is the only way to tell whether a CPU is isolated from all other activity.
A task can keep running on a given CPU as long as no scheduler tick occurs (which could select another task to run) and no other interrupt triggers. I would have suggested looking at the ftrace buffer, just as Daniel did, but you mention a few loopholes there. So the surest way is to look at /proc/interrupts, as far as I can see.
Regards,
Preeti U Murthy
On 30 June 2014 14:34, Preeti U Murthy <preeti@linux.vnet.ibm.com> wrote:
> I gave it some thought; it looks like /proc/interrupts is the only way to tell whether a CPU is isolated from all other activity.
> A task can keep running on a given CPU as long as no scheduler tick occurs (which could select another task to run) and no other interrupt triggers. I would have suggested looking at the ftrace buffer, just as Daniel did, but you mention a few loopholes there. So the surest way is to look at /proc/interrupts, as far as I can see.
Thanks Preeti, it's already up and running.
Let me know if anybody is interested in that script.
linaro-kernel@lists.linaro.org