Hi -
We're using one kernel binary with BL Switcher enabled in config, but able to work on SoC without Big Little.
This is OK except where the BL patches touch the PMU driver. It makes an assumption about BL configured == in use which is not true. PMU init fails and when you try to use perf list later, it blows chunks.
I worked around it with the hack below, so it can fail out from the bigLITTLE path when it doesn't see the cluster property in DT, but there's presumably a better way to do that which more directly checks if we care about BL in this execution environment.
-Andy
Author: Andy Green andy.green@linaro.org Date: Thu May 30 09:44:17 2013 +0800
bl switcher fix dont assume bl active in pmu probe
Signed-off-by: Andy Green andy.green@linaro.org
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c index b3ae24f..c02ea21 100644 --- a/arch/arm/kernel/perf_event_cpu.c +++ b/arch/arm/kernel/perf_event_cpu.c @@ -440,6 +440,9 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) hwid = of_get_property(ncluster, "reg", &len); if (hwid && len == 4) cluster = be32_to_cpup(hwid); + } else { + ret = probe_current_pmu(pmu); + goto bail; } /* set sibling mask to all cpu mask if socket is not specified */ /* @@ -501,7 +504,7 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) } else { ret = probe_current_pmu(pmu); } - +bail: if (ret) goto error;
On Thu, May 30, 2013 at 7:36 AM, Andy Green andy.green@linaro.org wrote:
Hi -
We're using one kernel binary with BL Switcher enabled in config, but able to work on SoC without Big Little.
This is OK except where the BL patches touch the PMU driver. It makes an assumption about BL configured == in use which is not true. PMU init fails and when you try to use perf list later, it blows chunks.
I worked around it with the hack below, so it can fail out from the bigLITTLE path when it doesn't see the cluster property in DT, but there's presumably a better way to do that which more directly checks if we care about BL in this execution environment.
-Andy
This is Dave's area but Nico ought to weigh in whether the presence of the DT property is the best place for this. It seems OK to me.
Author: Andy Green andy.green@linaro.org Date: Thu May 30 09:44:17 2013 +0800
bl switcher fix dont assume bl active in pmu probe Signed-off-by: Andy Green <andy.green@linaro.org>
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c index b3ae24f..c02ea21 100644 --- a/arch/arm/kernel/perf_event_cpu.c +++ b/arch/arm/kernel/perf_event_cpu.c @@ -440,6 +440,9 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) hwid = of_get_property(ncluster, "reg", &len); if (hwid && len == 4) cluster = be32_to_cpup(hwid);
} else {
ret = probe_current_pmu(pmu);
goto bail; } /* set sibling mask to all cpu mask if socket is not
specified */ /* @@ -501,7 +504,7 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) } else { ret = probe_current_pmu(pmu); }
+bail: if (ret) goto error;
On Thu, May 30, 2013 at 10:06:20AM +0800, Andy Green wrote:
Hi -
We're using one kernel binary with BL Switcher enabled in config, but able to work on SoC without Big Little.
This is OK except where the BL patches touch the PMU driver. It makes an assumption about BL configured == in use which is not true. PMU init fails and when you try to use perf list later, it blows chunks.
I worked around it with the hack below, so it can fail out from the bigLITTLE path when it doesn't see the cluster property in DT, but there's presumably a better way to do that which more directly checks if we care about BL in this execution environment.
-Andy
Author: Andy Green andy.green@linaro.org Date: Thu May 30 09:44:17 2013 +0800
bl switcher fix dont assume bl active in pmu probe Signed-off-by: Andy Green <andy.green@linaro.org>
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c index b3ae24f..c02ea21 100644 --- a/arch/arm/kernel/perf_event_cpu.c +++ b/arch/arm/kernel/perf_event_cpu.c @@ -440,6 +440,9 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) hwid = of_get_property(ncluster, "reg", &len); if (hwid && len == 4) cluster = be32_to_cpup(hwid);
} else {
ret = probe_current_pmu(pmu);
goto bail;
Can you provide logs?
Why is the DT different for MP versus IKS?
It is incorrect for the DT to be different, because the hardware is exactly the same in both cases.
If the DT isn't different, then can you elaborate on what is fixed by this change?
You may be getting affected by the fact that we're relying on non- pstreamed DT bindings to describe the PMUs in the system. Discussions upstream are taking a different direction, so this code is going to get substantially rewritten at some point. The perf support for IKS is a big hack at the moment... Unfortunately, I have ongoing distractions which mean that progress on this is mostly stalled right now.
There is also a separate problem caused by a recent change to the IKS code which breaks the perf support's assumptions about what physical/ logical CPU mappings are possible. I need to fix this, but it's not done yet... If you are using up-to-date IKS code, it's possible you're hitting this issue.
Cheers ---Dave
} /* set sibling mask to all cpu mask if socket is not
specified */ /* @@ -501,7 +504,7 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) } else { ret = probe_current_pmu(pmu); }
+bail: if (ret) goto error;
-- Andy Green | Fujitsu Landing Team Leader Linaro.org │ Open source software for ARM SoCs | Follow Linaro http://facebook.com/pages/Linaro/155974581091106 - http://twitter.com/#%21/linaroorg - http://linaro.org/linaro-blog
linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
On 30/05/13 18:50, the mail apparently from Dave Martin included:
On Thu, May 30, 2013 at 10:06:20AM +0800, Andy Green wrote:
Hi -
We're using one kernel binary with BL Switcher enabled in config, but able to work on SoC without Big Little.
This is OK except where the BL patches touch the PMU driver. It makes an assumption about BL configured == in use which is not true. PMU init fails and when you try to use perf list later, it blows chunks.
I worked around it with the hack below, so it can fail out from the bigLITTLE path when it doesn't see the cluster property in DT, but there's presumably a better way to do that which more directly checks if we care about BL in this execution environment.
-Andy
Author: Andy Green andy.green@linaro.org Date: Thu May 30 09:44:17 2013 +0800
bl switcher fix dont assume bl active in pmu probe Signed-off-by: Andy Green <andy.green@linaro.org>
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c index b3ae24f..c02ea21 100644 --- a/arch/arm/kernel/perf_event_cpu.c +++ b/arch/arm/kernel/perf_event_cpu.c @@ -440,6 +440,9 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) hwid = of_get_property(ncluster, "reg", &len); if (hwid && len == 4) cluster = be32_to_cpup(hwid);
} else {
ret = probe_current_pmu(pmu);
goto bail;
Can you provide logs?
If you really want them, but --->
Why is the DT different for MP versus IKS?
That is not the issue (nor did I mention IKS anywhere...), we are using CONFIG_ARCH_MULTIPLATFORM and supporting SoCs based on CA9 and SoCs with "other things". In the case I'm talking about, the kernel with big.LITTLE configured on finds itself waking up on a CA9 box.
It is incorrect for the DT to be different, because the hardware is exactly the same in both cases.
If the DT isn't different, then can you elaborate on what is fixed by this change?
The bug where having BL_SWITCHER configured on has gotten mixed up with the idea that the kernel must therefore be running on a bL SoC, for PMU purposes. In a CONFIG_ARCH_MULTIPLATFORM world, that ain't so.
I think the bug is smaller and less interesting than you're giving it credit for ^^ but it's still something that should be fixed.
-Andy
You may be getting affected by the fact that we're relying on non- pstreamed DT bindings to describe the PMUs in the system. Discussions upstream are taking a different direction, so this code is going to get substantially rewritten at some point. The perf support for IKS is a big hack at the moment... Unfortunately, I have ongoing distractions which mean that progress on this is mostly stalled right now.
There is also a separate problem caused by a recent change to the IKS code which breaks the perf support's assumptions about what physical/ logical CPU mappings are possible. I need to fix this, but it's not done yet... If you are using up-to-date IKS code, it's possible you're hitting this issue.
Cheers ---Dave
} /* set sibling mask to all cpu mask if socket is not
specified */ /* @@ -501,7 +504,7 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) } else { ret = probe_current_pmu(pmu); }
+bail: if (ret) goto error;
-- Andy Green | Fujitsu Landing Team Leader Linaro.org │ Open source software for ARM SoCs | Follow Linaro http://facebook.com/pages/Linaro/155974581091106 - http://twitter.com/#%21/linaroorg - http://linaro.org/linaro-blog
linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
On Thu, May 30, 2013 at 06:58:53PM +0800, Andy Green wrote:
On 30/05/13 18:50, the mail apparently from Dave Martin included:
On Thu, May 30, 2013 at 10:06:20AM +0800, Andy Green wrote:
Hi -
We're using one kernel binary with BL Switcher enabled in config, but able to work on SoC without Big Little.
This is OK except where the BL patches touch the PMU driver. It makes an assumption about BL configured == in use which is not true. PMU init fails and when you try to use perf list later, it blows chunks.
I worked around it with the hack below, so it can fail out from the bigLITTLE path when it doesn't see the cluster property in DT, but there's presumably a better way to do that which more directly checks if we care about BL in this execution environment.
-Andy
Author: Andy Green andy.green@linaro.org Date: Thu May 30 09:44:17 2013 +0800
bl switcher fix dont assume bl active in pmu probe Signed-off-by: Andy Green <andy.green@linaro.org>
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c index b3ae24f..c02ea21 100644 --- a/arch/arm/kernel/perf_event_cpu.c +++ b/arch/arm/kernel/perf_event_cpu.c @@ -440,6 +440,9 @@ static int cpu_pmu_device_probe(struct platform_device *pdev) hwid = of_get_property(ncluster, "reg", &len); if (hwid && len == 4) cluster = be32_to_cpup(hwid);
} else {
ret = probe_current_pmu(pmu);
goto bail;
Can you provide logs?
If you really want them, but --->
Why is the DT different for MP versus IKS?
That is not the issue (nor did I mention IKS anywhere...), we are using CONFIG_ARCH_MULTIPLATFORM and supporting SoCs based on CA9 and SoCs with "other things". In the case I'm talking about, the kernel with big.LITTLE configured on finds itself waking up on a CA9 box.
It is incorrect for the DT to be different, because the hardware is exactly the same in both cases.
If the DT isn't different, then can you elaborate on what is fixed by this change?
The bug where having BL_SWITCHER configured on has gotten mixed up with the idea that the kernel must therefore be running on a bL SoC, for PMU purposes. In a CONFIG_ARCH_MULTIPLATFORM world, that ain't so.
I think the bug is smaller and less interesting than you're giving it credit for ^^ but it's still something that should be fixed.
Ah, I get you now.
So, the problem is the hacked DT bindings we're using for vexpress, which aren't compatible with upstream -- the perf changes assume these non-standard bindings are in use.
Your fix won't work for platforms which describe multiple CPU PMUs in their DTs -- but that shouldn't happen for non-multicluster platforms anyway.
Since this code needs significant rework anyway, I think your patch is a reasonable workaround for now?
Nico, do you think it's worth picking this up in your tree? It only really affects people using the BL support in multiplatform kernels.
Cheers ---Dave
On Thu, 30 May 2013, Dave Martin wrote:
So, the problem is the hacked DT bindings we're using for vexpress, which aren't compatible with upstream -- the perf changes assume these non-standard bindings are in use.
Your fix won't work for platforms which describe multiple CPU PMUs in their DTs -- but that shouldn't happen for non-multicluster platforms anyway.
Since this code needs significant rework anyway, I think your patch is a reasonable workaround for now?
Nico, do you think it's worth picking this up in your tree? It only really affects people using the BL support in multiplatform kernels.
You are the perf master, so if you agree I agree.
I'd need a patch that can be applied though. And your ACK would be good.
In fact Tixy might be interested as well as the IKS patches feed into the linux-linaro tree via his tree.
Nicolas
On Thu, May 30, 2013 at 12:21:07PM -0400, Nicolas Pitre wrote:
On Thu, 30 May 2013, Dave Martin wrote:
So, the problem is the hacked DT bindings we're using for vexpress, which aren't compatible with upstream -- the perf changes assume these non-standard bindings are in use.
Your fix won't work for platforms which describe multiple CPU PMUs in their DTs -- but that shouldn't happen for non-multicluster platforms anyway.
Since this code needs significant rework anyway, I think your patch is a reasonable workaround for now?
Nico, do you think it's worth picking this up in your tree? It only really affects people using the BL support in multiplatform kernels.
You are the perf master, so if you agree I agree.
I'd need a patch that can be applied though. And your ACK would be good.
In fact Tixy might be interested as well as the IKS patches feed into the linux-linaro tree via his tree.
I'll repost the patch in your direction when I've tried it, but I didn't get very far with that today...
Cheers ---Dave