On 21/09/2022 16:08, Catalin Marinas wrote:
On Wed, Sep 21, 2022 at 03:05:34PM +0100, James Clark wrote:
As suggested by Catalin here's the change to add Coresight to defconfig.
Unfortunately I don't think we should add CONFIG_CORESIGHT_SOURCE_ETM4X, which builds a few files, until [1] is merged, because of the overhead of CONFIG_PID_IN_CONTEXTIDR.
I thought the overhead wasn't the problem, it's mostly negligible. We can probably save a few more cycles on the __switch_to() path by replacing several isb()s in those functions with a single one just before cpu_switch_to().
IIRC the issue is that unless a process runs in the root pid namespace, the actual pid written to contextidr is meaningless.
This is true, and Leo has recently disabled it in that scenario in aab473867fed.
Now that you reminded me of that thread, I see three options (sorry, not entirely related to the defconfig updates):
1. Remove CONFIG_PID_IN_CONTEXTIDR and corresponding code completely, find other events to correlate the task with the trace.
Unfortunately when tracing per core we would need kernel timestamps in the trace to correlate to the switch records. At the moment Coresight is using a different clock source so it's not possible and we're still using the context ID to correlate samples.
With FEAT_TRF in v8.4 it will be possible to do this and we've started working towards that here: 0f00b223ea22. But we'd still have to support older hardware too, so CONFIG_PID_IN_CONTEXTIDR can't be removed completely.
For SPE it's not required because we already have the right timestamps in the samples and we've added support for no context IDs in the decoder here: 27d113cfe892
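To illustrate what correlating by timestamp would mean once the trace and the switch records share a clock, here is a minimal userspace sketch. It is not the perf decoder code: struct switch_rec and pid_at() are hypothetical names, and it assumes the switch records are already sorted by timestamp.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context-switch record: at time ts, task pid was switched in. */
struct switch_rec {
	uint64_t ts;
	int pid;
};

/*
 * Return the pid that was running at sample_ts, i.e. the pid of the last
 * record with ts <= sample_ts, or -1 if the sample predates every record.
 * Assumption: recs is sorted by ascending ts.
 */
static int pid_at(const struct switch_rec *recs, size_t n, uint64_t sample_ts)
{
	size_t lo = 0, hi = n;	/* search window is [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (recs[mid].ts <= sample_ts)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* lo is now the index of the first record with ts > sample_ts. */
	return lo ? recs[lo - 1].pid : -1;
}
```

With the context ID in the trace this lookup is unnecessary, which is why older hardware without a common clock source still depends on it.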
2. Make CONFIG_PID_IN_CONTEXTIDR always on (we might as well remove the Kconfig entry). This would write the root pid namespace value (task_pid_nr()).
If we're not worried about the overhead after all, this would be the easiest solution. And then SPE or Coresight already decide whether they want to use the value or not, so no further changes are needed.
From Leo's patch there is a table that shows a 1% overhead with it enabled permanently, and I've heard a figure like that mentioned before. So I could also resurrect that patch to use static keys? Although it's a bit more complicated, that would be my preference. And then we can have that mode always on.
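As a rough userspace model of the static key idea (not the kernel static key API, and not Leo's patch): the context ID is only written on a switch while at least one trace session has asked for it. All names below are hypothetical, and a plain atomic counter stands in for the runtime code patching that real static keys do.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a static key: number of active trace sessions. */
static atomic_int contextidr_users;

/* Hypothetical stand-ins for the CONTEXTIDR register and a write counter. */
static uint32_t fake_contextidr;
static unsigned long contextidr_writes;

/* A trace session would call these when it starts and stops. */
static void contextidr_enable(void)
{
	atomic_fetch_add(&contextidr_users, 1);
}

static void contextidr_disable(void)
{
	atomic_fetch_sub(&contextidr_users, 1);
}

/* Simulated context switch: skip the register write when nobody needs it. */
static void switch_to(uint32_t next_pid)
{
	if (atomic_load_explicit(&contextidr_users, memory_order_relaxed) > 0) {
		fake_contextidr = next_pid;
		contextidr_writes++;
	}
}
```

In this model a switch with no active session costs one relaxed load and a branch. A real static key would presumably reduce that further, since the disabled path is patched to a no-op.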
3. Similar to (2) but instead write task_pid_nr_ns(). An alternative here is to write -1 if the task is not in the root pid namespace.
Strong preference for (1).
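For comparison, the values that the always-on and namespace-aware options would put in the context ID can be modelled as below. This is only a sketch: struct task_model and the ctxid_*() helpers are hypothetical, the pids are plain fields rather than results of task_pid_nr() or task_pid_nr_ns(), and -1 is assumed to be stored as an all-ones 32-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a task as seen from two pid namespaces. */
struct task_model {
	uint32_t root_pid;	/* pid in the root namespace */
	uint32_t ns_pid;	/* pid in the task's own namespace */
	bool in_root_pidns;	/* true if the task runs in the root namespace */
};

/* Always write the root namespace pid. */
static uint32_t ctxid_root(const struct task_model *t)
{
	return t->root_pid;
}

/* Write the pid as seen inside the task's own namespace. */
static uint32_t ctxid_ns(const struct task_model *t)
{
	return t->ns_pid;
}

/* Alternative: write -1 (all ones) for tasks outside the root namespace. */
static uint32_t ctxid_root_or_invalid(const struct task_model *t)
{
	return t->in_root_pidns ? t->root_pid : UINT32_MAX;
}
```

For a containerised task with root pid 1234 and namespace pid 1, the three helpers give 1234, 1 and 0xffffffff respectively. For a task in the root namespace all three agree.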