On Fri, Oct 30, 2020 at 01:29:56PM +0530, Sai Prakash Ranjan wrote:
Hello guys,
On 2020-10-24 02:07, Mathieu Poirier wrote:
On Fri, Oct 23, 2020 at 03:44:16PM +0200, Peter Zijlstra wrote:
On Fri, Oct 23, 2020 at 02:29:54PM +0100, Suzuki Poulose wrote:
On 10/23/20 2:16 PM, Peter Zijlstra wrote:
On Fri, Oct 23, 2020 at 01:56:47PM +0100, Suzuki Poulose wrote:
That way another session could use the same sink if it is free, i.e.:
perf record -e cs_etm/@sink0/u --per-thread app1
and
perf record -e cs_etm/@sink0/u --per-thread app2
both can work as long as the sink is not used by the other session.
As I said above, if the sink is shared between CPUs, that's going to be a trainwreck :/ Why do you want that?
That ship has sailed. That is how the current generation of systems is, unfortunately. But as I said, this is changing and there are guidelines in place to avoid these kinds of topologies. With future technologies, this will be completely gone.
I understand that the hardware is like that, but why do you want to support this insanity in software?
If you only allow a single sink user (group) at the same time, your problem goes away. Simply disallow the above scenario, do not allow concurrent sink users if sinks are shared like this.
Have the perf-record of app2 above fail because the sink is already in use.
I agree with you that --per-thread scenarios are easy to deal with, but to support cpu-wide scenarios events must share a sink (because there is one event per CPU). CPU-wide support can't be removed because it has been around for close to a couple of years and is heavily used. I also think using the pid of the process that created the events, i.e. perf, is a good idea. We just need to agree on how to gain access to it.
In Sai's patch you objected to the following:
struct task_struct *task = READ_ONCE(event->owner);
if (!task || is_kernel_event(event))
Would it be better to use task_pid_nr(current) instead of event->owner? The end result will be exactly the same. There is also no need to check the validity of @current since it is a user process.
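To make the idea under discussion concrete, here is a minimal user-space sketch of a sink keyed by the pid of the session that claimed it. It is only a model of the proposal, not the CoreSight driver code: `struct sink_model`, `sink_init()`, `sink_acquire()` and `sink_release()` are hypothetical names invented for illustration, and the locking a real driver would need is left out. Events created by the same pid (the cpu-wide case) share the sink through a reference count, while a second session with a different pid gets -EBUSY until the sink is free again.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical user-space model of a shared trace sink; not kernel code. */
struct sink_model {
	pid_t owner_pid;	/* -1 while the sink is free */
	int refcnt;		/* number of events currently using the sink */
};

static void sink_init(struct sink_model *s)
{
	s->owner_pid = -1;
	s->refcnt = 0;
}

/*
 * Claim the sink for the session identified by @pid.
 * Returns 0 on success, -EBUSY if another session already owns the sink.
 * Assumption: callers are serialised; a real driver would hold a lock here.
 */
static int sink_acquire(struct sink_model *s, pid_t pid)
{
	if (s->owner_pid != -1 && s->owner_pid != pid)
		return -EBUSY;

	s->owner_pid = pid;
	s->refcnt++;
	return 0;
}

/* Drop one reference; the sink becomes free when the last user is gone. */
static void sink_release(struct sink_model *s)
{
	if (s->refcnt > 0 && --s->refcnt == 0)
		s->owner_pid = -1;
}
```

With this model, two events from the same perf session can both acquire the sink, and a second perf-record started by a different process fails until every reference held by the first session has been released.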
We have devices deployed where these crashes are seen consistently, so for some immediate relief, could we at least get some fix in this cycle without a major design overhaul, which would likely take more time? Perhaps my first patch [1] without any check for owner, or I can post a new version as Suzuki suggested [2], dropping the export of is_kernel_event(). Then we can always work on top of it based on the conclusion of this discussion; we will at least not have the systems crash in the meantime. Thoughts?
For the time being I think [1], exactly the way it is, is a reasonable way forward.
Regards, Mathieu
[1] https://lore.kernel.org/patchwork/patch/1318098/
[2] https://lore.kernel.org/lkml/fa6cdf34-88a0-1050-b9ea-556d0a9438cb@arm.com/
Thanks, Sai
-- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation