The TRBE driver support is build as a module, we found some driver issues
based on the patchset [1] and set CONFIG_CORESIGHT_TRBE=m.
1. TRBE driver potential sleep in atomic context when unregister device
2. Multiple free the platform data resource when rmmod coresight TRBE
driver
[1] "coresight: trbe: Enable ACPI based devices"
https://lore.kernel.org/all/20230808082247.383405-1-anshuman.khandual@arm.c…
Junhao He (2):
coresight: trbe: Fix TRBE potential sleep in atomic context
coresight: core: Fix multiple free TRBE platform data resource
drivers/hwtracing/coresight/coresight-core.c | 7 ++--
drivers/hwtracing/coresight/coresight-trbe.c | 35 +++++++++++---------
2 files changed, 24 insertions(+), 18 deletions(-)
--
2.33.0
This series enables detection of ACPI based TRBE devices via a stand alone
purpose built representative platform device. But as a pre-requisite this
changes coresight_platform_data structure assignment for the TRBE device.
This series is based on v6.5-rc5 kernel, is also dependent on the following
EDK2 changes posted earlier by Sami.
https://edk2.groups.io/g/devel/message/107239https://edk2.groups.io/g/devel/message/107241
Changes in V5:
- Detected zeroed parsed GSI as a mismatch but handled all zero scenario
- Changed condition check from 'if (ret < 0)' into a 'if (ret)'
- Dropped pr_warn() message after platform_device_register()
Changes in V4:
https://lore.kernel.org/all/20230808082247.383405-1-anshuman.khandual@arm.c…
- Added in-code comment for arm_trbe_device_probe()
- Reverted back using IS_ENABLED() for SPE PMU platform device
- Replaced #ifdef with IS_ENABLED() for TRBE platform device
- Protected arm_trbe_acpi_match with ACPI_PTR() - preventing a build failure
when CONFIG_ACPI is not enabled
- Added __maybe_unused for arm_acpi_register_pmu_device() and dropped config
checks with IS_ENABLED()
Changes in V3:
https://lore.kernel.org/all/20230803055652.1322801-1-anshuman.khandual@arm.…
- Changed ARMV8_TRBE_PDEV_NAME from "arm-trbe-acpi" into "arm,trbe"
- Dropped local variable 'matched'
- Replaced 'matched' with 'valid gsi' as being already matched once
- Moved find_acpi_cpu_topology_hetero_id() outside conditional check
Changes in V2:
https://lore.kernel.org/all/20230801094052.750416-1-anshuman.khandual@arm.c…
- Refactored arm_spe_acpi_register_device() in a separate patch
- Renamed trbe_acpi_resources as trbe_resources
- Renamed trbe_acpi_dev as trbe_dev
Changes in V1:
https://lore.kernel.org/all/20230728112733.359620-1-anshuman.khandual@arm.c…
Cc: Sami Mujawar <sami.mujawar(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Leo Yan <leo.yan(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: James Clark <james.clark(a)arm.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Anshuman Khandual (4):
arm_pmu: acpi: Refactor arm_spe_acpi_register_device()
arm_pmu: acpi: Add a representative platform device for TRBE
coresight: trbe: Add a representative coresight_platform_data for TRBE
coresight: trbe: Enable ACPI based TRBE devices
arch/arm64/include/asm/acpi.h | 3 +
drivers/hwtracing/coresight/coresight-trbe.c | 26 +++-
drivers/hwtracing/coresight/coresight-trbe.h | 2 +
drivers/perf/arm_pmu_acpi.c | 142 ++++++++++++++-----
include/linux/perf/arm_pmu.h | 1 +
5 files changed, 132 insertions(+), 42 deletions(-)
--
2.25.1
On 15/08/2023 10:40, John Garry wrote:
> On 11/08/2023 15:39, James Clark wrote:
>> N2 r0p3 doesn't require the workaround [1], so gating on (#slots - 5) no
>> longer works because all N2s have 5 slots. Add a new expression builtin
>> that allows calling strcmp_cpuid_str() and comparing CPUIDs in metric
>> formulas.
>>
>> In this case, the commented formula looks like this:
>>
>> strcmp_cpuid_str(0x410fd493) # greater than or equal to N2 r0p3
>> | strcmp_cpuid_str(0x410fd490) ^ 1 # OR NOT any version of N2
>>
>> [1]:https://urldefense.com/v3/__https://gitlab.arm.com/telemetry-solution/te… Signed-off-by: James Clark<james.clark(a)arm.com>
>> ---
>> tools/perf/arch/arm64/util/pmu.c | 18 +-----------------
>> .../arch/arm64/arm/neoverse-n2-v2/metrics.json | 8 ++++----
>> tools/perf/pmu-events/metric.py | 17 +++++++++++++++--
>> tools/perf/util/expr.c | 18 ++++++++++++++++++
>> tools/perf/util/expr.h | 1 +
>> tools/perf/util/expr.l | 1 +
>> tools/perf/util/expr.y | 8 +++++++-
>> tools/perf/util/pmu.c | 17 +++++++++++++++++
>> tools/perf/util/pmu.h | 1 +
>> 9 files changed, 65 insertions(+), 24 deletions(-)
>
> This patch looks ok, but I think that it would be better to separate out
> the metrics.json change into a separate patch.
>
Yep I thought about doing that, I can change it in the next version.
> Thanks,
> John
On 15/08/2023 10:35, John Garry wrote:
> On 11/08/2023 15:39, James Clark wrote:
>> Currently variant and revision fields are masked out of the MIDR so
>> it's not possible to compare different versions of the same CPU.
>> In a later commit a workaround will be removed just for N2 r0p3, so
>> enable comparisons on version.
>>
>> This has the side effect of changing the MIDR stored in the header of
>> the perf.data file to no longer have masked version fields. It also
>> affects the lookups in mapfile.csv, but as that currently only has
>> zeroed version fields, it has no actual effect. The mapfile.csv
>> documentation also states to zero the version fields, so unless this
>> isn't done it will continue to have no effect.
>>
>
> This looks ok apart from a couple of comments, below.
>
> Thanks,
> John
>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> tools/perf/arch/arm64/util/header.c | 64 ++++++++++++++++++++++-------
>> 1 file changed, 50 insertions(+), 14 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/header.c
>> b/tools/perf/arch/arm64/util/header.c
>> index 80b9f6287fe2..8f74e801e1ab 100644
>> --- a/tools/perf/arch/arm64/util/header.c
>> +++ b/tools/perf/arch/arm64/util/header.c
>> @@ -1,3 +1,6 @@
>> +#include <linux/kernel.h>
>> +#include <linux/bits.h>
>> +#include <linux/bitfield.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <perf/cpumap.h>
>> @@ -10,14 +13,12 @@
>> #define MIDR "/regs/identification/midr_el1"
>> #define MIDR_SIZE 19
>> -#define MIDR_REVISION_MASK 0xf
>> -#define MIDR_VARIANT_SHIFT 20
>> -#define MIDR_VARIANT_MASK (0xf << MIDR_VARIANT_SHIFT)
>> +#define MIDR_REVISION_MASK GENMASK(3, 0)
>> +#define MIDR_VARIANT_MASK GENMASK(23, 20)
>> static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map
>> *cpus)
>> {
>> const char *sysfs = sysfs__mountpoint();
>> - u64 midr = 0;
>> int cpu;
>> if (!sysfs || sz < MIDR_SIZE)
>> @@ -44,21 +45,11 @@ static int _get_cpuid(char *buf, size_t sz, struct
>> perf_cpu_map *cpus)
>> }
>> fclose(file);
>> - /* Ignore/clear Variant[23:20] and
>> - * Revision[3:0] of MIDR
>> - */
>> - midr = strtoul(buf, NULL, 16);
>> - midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
>> - scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
>> /* got midr break loop */
>> break;
>> }
>> perf_cpu_map__put(cpus);
>> -
>> - if (!midr)
>> - return EINVAL;
>
> Is there a reason to drop this check?
>
> As I see, it is still checked in perf_pmu__getcpudid() ->
> get_cpuid_str() -> _get_cpuid(), and we don't zero the buf allocated in
> _get_cpuid()
>
Ah yes, now if all the files fail to open or read then buf will be
uninitialized. I make it so that it will return EINVAL unless the
fgets() succeeds, but I don't think we need to add the strtoul() back in?
>> -
>> return 0;
>> }
>> @@ -99,3 +90,48 @@ char *get_cpuid_str(struct perf_pmu *pmu)
>> return buf;
>> }
>> +
>> +/*
>> + * Return 0 if idstr is a higher or equal to version of the same part as
>> + * mapcpuid.
>
> And what other values may be returned? If just 0/1, then can we have a
> bool return value?
>
I don't think that's best for consistency. All the other CPU ID
comparison functions return the strcmp style return values which is the
reverse of booleans. We could change them all to bool, but it would be a
big change, and they'd still have strcmp in the name which suggests
-1/0/1 return values (although -1 is never used here).
I will add to the comment that 1 is returned for a comparison failure
thought. That is missing.
>> + *
>> + * Therefore, if mapcpuid has 0 for revision and variant then any
>> version of
>> + * idstr will match as long as it's the same CPU type.
>> + */
>> +int strcmp_cpuid_str(const char *mapcpuid, const char *idstr)
>> +{
>> + u64 map_id = strtoull(mapcpuid, NULL, 16);
>> + char map_id_variant = FIELD_GET(MIDR_VARIANT_MASK, map_id);
>> + char map_id_revision = FIELD_GET(MIDR_REVISION_MASK, map_id);
>> + u64 id = strtoull(idstr, NULL, 16);
>> + char id_variant = FIELD_GET(MIDR_VARIANT_MASK, id);
>> + char id_revision = FIELD_GET(MIDR_REVISION_MASK, id);
>> + u64 id_fields = ~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK);
>> +
>> + /* Compare without version first */
>> + if ((map_id & id_fields) != (id & id_fields))
>> + return 1;
>> +
>> + /*
>> + * ID matches, now compare version.
>> + *
>> + * Arm revisions (like r0p0) are compared here like two digit semver
>> + * values eg. 1.3 < 2.0 < 2.1 < 2.2. The events json file with the
>> + * highest matching version is used.
>> + *
>> + * r = high value = 'Variant' field in MIDR
>> + * p = low value = 'Revision' field in MIDR
>> + *
>> + */
>> + if (id_variant > map_id_variant)
>> + return 0;
>> +
>> + if (id_variant == map_id_variant && id_revision >= map_id_revision)
>> + return 0;
>> +
>> + /*
>> + * variant is less than mapfile variant or variants are the same but
>> + * the revision doesn't match. Return no match.
>> + */
>> + return 1;
>> +}
>
On 14/08/2023 15:43, John Garry wrote:
> On 14/08/2023 15:15, James Clark wrote:
>>
>> On 14/08/2023 14:07, John Garry wrote:
>>> On 11/08/2023 15:39, James Clark wrote:
>>>> Currently variant and revision fields are masked out of the MIDR so
>>>> it's not possible to compare different versions of the same CPU.
>>>> In a later commit a workaround will be removed just for N2 r0p3, so
>>>> enable comparisons on version.
>>>>
>>>> This has the side effect of changing the MIDR stored in the header of
>>>> the perf.data file to no longer have masked version fields.
>>> Did you consider adding a raw version of _get_cpuid(), which returns the
>>> full MIDR just for the purpose of caller strcmp_cpuid_str()?
>> I did, but I thought that seeing as it would only be used in one place,
>> and that changing the existing one didn't break anything, that it was
>> better to not fragment the CPU ID interface. I thought it might also
>> have repercussions for the other architectures as well. It would also
>> mean that the MIDR that's stored in the header wouldn't have the version
>> information, which if we're starting to do things with that could be bad.
>>
>> There are already callers of strcmp_cpuid_str() so it's probably best to
>> keep it using the same get_cpuid() string. Unless there is a reason
>> _not_ to do it? There isn't really anything that can't be done with it
>> accepting/returning the full unmasked MIDR. If you want the old
>> behavior, you just set the version fields to 0, which I've also used in
>> a later patch and is already done in mapfile.csv
>>
>
> ok, fine, so we seems that we would be following x86 on this in terms of
> using strcmp_cpuid_str(). It would be good to mention that there is
> already a weak version of strcmp_cpuid_str() for !x86 in your commit
> message.
>
Yep I can add this.
> Let me check your code again...
>
> Thanks,
> John
On 8/11/23 15:49, Will Deacon wrote:
> On Wed, Aug 09, 2023 at 01:54:55PM +0100, Suzuki K Poulose wrote:
>> On 08/08/2023 14:16, Will Deacon wrote:
>>> On Tue, Aug 08, 2023 at 09:48:16AM +0100, Suzuki K Poulose wrote:
>>>> On 08/08/2023 09:22, Anshuman Khandual wrote:
>>>>> Sanity checking all the GICC tables for same interrupt number, and ensuring
>>>>> a homogeneous ACPI based machine, could be used for other platform devices
>>>>> as well. Hence this refactors arm_spe_acpi_register_device() into a common
>>>>> helper arm_acpi_register_pmu_device().
>>>>>
>>>>> Cc: Catalin Marinas <catalin.marinas(a)arm.com>
>>>>> Cc: Will Deacon <will(a)kernel.org>
>>>>> Cc: Mark Rutland <mark.rutland(a)arm.com>
>>>>> Cc: linux-arm-kernel(a)lists.infradead.org
>>>>> Cc: linux-kernel(a)vger.kernel.org
>>>>> Co-developed-by: Will Deacon <will(a)kernel.org>
>>>>> Signed-off-by: Will Deacon <will(a)kernel.org>
>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
>>>>> ---
>>>>> drivers/perf/arm_pmu_acpi.c | 105 ++++++++++++++++++++++--------------
>>>>> 1 file changed, 65 insertions(+), 40 deletions(-)
>>>>>
>>>>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>>>>> index 90815ad762eb..72454bef2a70 100644
>>>>> --- a/drivers/perf/arm_pmu_acpi.c
>>>>> +++ b/drivers/perf/arm_pmu_acpi.c
>>>>> + pdev->resource[0].start = irq;
>>>>> + ret = platform_device_register(pdev);
>>>>> + if (ret < 0) {
>>>>> + pr_warn("ACPI: %s: Unable to register device\n", pdev->name);
>>>>> + acpi_unregister_gsi(gsi);
>>>>> + }
>>>>> + return ret;
>>>>
>>>> A postivie return value here could confuse the caller. Also, with my comment
>>>> below, we don't really need to return something from here.
>>>
>>> How does this return a positive value?
>>
>> Right now, there aren't. My point is this function returns a "return value"
>> of another function. And the caller of this function doesn't
>> really follow the "check" it needs. e.g.:
>>
>> ret = foo();
>> if (ret < 0)
>> error;
>> return ret;
>>
>>
>>
>> And the caller only checks for
>>
>> if (ret)
>> error;
>>
>> This seems fragile.
>
> Yeah, the '< 0' check is weird. I'd be inclined to drop that entirely
> from the helper function tbh...
static int __maybe_unused
arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *))
{
.....
.....
ret = platform_device_register(pdev);
if (ret < 0) {
pr_warn("ACPI: %s: Unable to register device\n", pdev->name);
acpi_unregister_gsi(gsi);
}
return ret;
}
We would still need to call acpi_unregister_gsi() in the helper itself in case previous
platform_device_register() did not work correctly. AFAICS, acpi_unregister_gsi() cannot
be moved to the caller. We could change the error check as 'if (ret)' which is the case
in many other places in the tree. Also drop the above generic pr_warn() error message.
static int __maybe_unused
arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *))
{
.....
.....
ret = platform_device_register(pdev);
if (ret)
acpi_unregister_gsi(gsi);
return ret;
}
>
>>>>> + int ret = arm_acpi_register_pmu_device(&spe_dev, ACPI_MADT_GICC_SPE,
>>>>> + arm_spe_parse_gsi);
>>>>> + if (ret)
>>>>> pr_warn("ACPI: SPE: Unable to register device\n");
>>>>
>>>> With this change, a system without SPE interrupt description always
>>>> generates the above message. Is this intended ?
>>>
>>> If there are no irqs, why doesn't this return 0?
>>
>> Apologies, I missed that.
>>
>>> arm_acpi_register_pmu_device() should only fail if either:
>>>
>>> - The static resources passed in are broken
>>> - The tables are not homogeneous
>>> - We fail to register the interrupt
>>>
>>> so something is amiss.
>>
>> Agreed. We don't need duplicate messages about an error ?
>> i.e., one in arm_acpi_register_pmu_device() and another
>> one in the caller ? (Of course adding any missing error msgs).
>
> ... and then just print the registration failure message in the caller.
But we already have such messages in respective callers.
static void arm_spe_acpi_register_device(void)
{
int ret = arm_acpi_register_pmu_device(&spe_dev, ACPI_MADT_GICC_SPE,
arm_spe_parse_gsi);
if (ret)
pr_warn("ACPI: SPE: Unable to register device\n");
}
static void arm_trbe_acpi_register_device(void)
{
int ret = arm_acpi_register_pmu_device(&trbe_dev, ACPI_MADT_GICC_TRBE,
arm_trbe_parse_gsi);
if (ret)
pr_warn("ACPI: TRBE: Unable to register device\n");
}
On 8/11/23 16:30, Will Deacon wrote:
> On Fri, Aug 11, 2023 at 03:55:43PM +0530, Anshuman Khandual wrote:
>>
>>
>> On 8/11/23 15:42, Will Deacon wrote:
>>> On Fri, Aug 11, 2023 at 02:13:42PM +0530, Anshuman Khandual wrote:
>>>> On 8/8/23 13:52, Anshuman Khandual wrote:
>>>>> + /*
>>>>> + * Sanity check all the GICC tables for the same interrupt
>>>>> + * number. For now, only support homogeneous ACPI machines.
>>>>> + */
>>>>> + for_each_possible_cpu(cpu) {
>>>>> + struct acpi_madt_generic_interrupt *gicc;
>>>>> +
>>>>> + gicc = acpi_cpu_get_madt_gicc(cpu);
>>>>> + if (gicc->header.length < len)
>>>>> + return gsi ? -ENXIO : 0;
>>>>> +
>>>>> + this_gsi = parse_gsi(gicc);
>>>>> + if (!this_gsi)
>>>>> + return gsi ? -ENXIO : 0;
>>>>> +
>>>>> + this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
>>>>> + if (!gsi) {
>>>>> + hetid = this_hetid;
>>>>> + gsi = this_gsi;
>>>>> + } else if (hetid != this_hetid || gsi != this_gsi) {
>>>>> + pr_warn("ACPI: %s: must be homogeneous\n", pdev->name);
>>>>> + return -ENXIO;
>>>>> + }
>>>>> + }
>>>>
>>>> As discussed on the previous version i.e V3 thread, will move the
>>>> 'this_gsi' check after parse_gsi(), inside if (!gsi) conditional
>>>> block. This will treat subsequent cpu parse_gsi()'s failure as a
>>>> mismatch thus triggering the pr_warn() message.
>>>>
>>>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>>>> index 845683ca7c64..6eae772d6298 100644
>>>> --- a/drivers/perf/arm_pmu_acpi.c
>>>> +++ b/drivers/perf/arm_pmu_acpi.c
>>>> @@ -98,11 +98,11 @@ arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
>>>> return gsi ? -ENXIO : 0;
>>>>
>>>> this_gsi = parse_gsi(gicc);
>>>> - if (!this_gsi)
>>>> - return gsi ? -ENXIO : 0;
>>>> -
>>>> this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
>>>> if (!gsi) {
>>>> + if (!this_gsi)
>>>> + return 0;
>>>
>>> Why do you need this hunk?
>>
>> Otherwise '0' gsi on all cpus would just clear the above homogeneity
>> test, and end up in acpi_register_gsi() making it fail, but with the
>> following warning before returning with -ENXIO.
>>
>> irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
>> if (irq < 0) {
>> pr_warn("ACPI: %s Unable to register interrupt: %d\n", pdev->name, gsi);
>> return -ENXIO;
>> }
>
> Ah gotcha, thanks.
>
>> Is this behaviour better than returning 0 after detecting '0' gsi in
>> the first cpu to avoid the above mentioned scenario ? Although 0 gsi
>> followed by non-zero ones will still end up warning about a mismatch.
>
> Can we move the check _after_ the loop, then? That way, we still detect
> mismatches but we'll quietly return 0 if nobody has an interrupt.
Sure, will fold in the following changes instead.
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 845683ca7c64..d7beb035345a 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -98,9 +98,6 @@ arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
return gsi ? -ENXIO : 0;
this_gsi = parse_gsi(gicc);
- if (!this_gsi)
- return gsi ? -ENXIO : 0;
-
this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
if (!gsi) {
hetid = this_hetid;
@@ -111,6 +108,15 @@ arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
}
}
+ /*
+ * This is a special case where no cpu on
+ * the system has the interrupt and which
+ * could not have been detected via above
+ * homogeneous mismatch test.
+ */
+ if (!this_gsi)
+ return 0;
+
irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
if (irq < 0) {
pr_warn("ACPI: %s Unable to register interrupt: %d\n", pdev->name, gsi);