On 15/08/2023 10:35, John Garry wrote:
> On 11/08/2023 15:39, James Clark wrote:
>> Currently variant and revision fields are masked out of the MIDR so
>> it's not possible to compare different versions of the same CPU.
>> In a later commit a workaround will be removed just for N2 r0p3, so
>> enable comparisons on version.
>>
>> This has the side effect of changing the MIDR stored in the header of
>> the perf.data file to no longer have masked version fields. It also
>> affects the lookups in mapfile.csv, but as that currently only has
>> zeroed version fields, it has no actual effect. The mapfile.csv
>> documentation also states to zero the version fields, so unless this
>> isn't done it will continue to have no effect.
>>
>
> This looks ok apart from a couple of comments, below.
>
> Thanks,
> John
>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> tools/perf/arch/arm64/util/header.c | 64 ++++++++++++++++++++++-------
>> 1 file changed, 50 insertions(+), 14 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/header.c
>> b/tools/perf/arch/arm64/util/header.c
>> index 80b9f6287fe2..8f74e801e1ab 100644
>> --- a/tools/perf/arch/arm64/util/header.c
>> +++ b/tools/perf/arch/arm64/util/header.c
>> @@ -1,3 +1,6 @@
>> +#include <linux/kernel.h>
>> +#include <linux/bits.h>
>> +#include <linux/bitfield.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <perf/cpumap.h>
>> @@ -10,14 +13,12 @@
>> #define MIDR "/regs/identification/midr_el1"
>> #define MIDR_SIZE 19
>> -#define MIDR_REVISION_MASK 0xf
>> -#define MIDR_VARIANT_SHIFT 20
>> -#define MIDR_VARIANT_MASK (0xf << MIDR_VARIANT_SHIFT)
>> +#define MIDR_REVISION_MASK GENMASK(3, 0)
>> +#define MIDR_VARIANT_MASK GENMASK(23, 20)
>> static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map
>> *cpus)
>> {
>> const char *sysfs = sysfs__mountpoint();
>> - u64 midr = 0;
>> int cpu;
>> if (!sysfs || sz < MIDR_SIZE)
>> @@ -44,21 +45,11 @@ static int _get_cpuid(char *buf, size_t sz, struct
>> perf_cpu_map *cpus)
>> }
>> fclose(file);
>> - /* Ignore/clear Variant[23:20] and
>> - * Revision[3:0] of MIDR
>> - */
>> - midr = strtoul(buf, NULL, 16);
>> - midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
>> - scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
>> /* got midr break loop */
>> break;
>> }
>> perf_cpu_map__put(cpus);
>> -
>> - if (!midr)
>> - return EINVAL;
>
> Is there a reason to drop this check?
>
> As I see, it is still checked in perf_pmu__getcpudid() ->
> get_cpuid_str() -> _get_cpuid(), and we don't zero the buf allocated in
> _get_cpuid()
>
Ah yes, now if all the files fail to open or read then buf will be
uninitialized. I make it so that it will return EINVAL unless the
fgets() succeeds, but I don't think we need to add the strtoul() back in?
>> -
>> return 0;
>> }
>> @@ -99,3 +90,48 @@ char *get_cpuid_str(struct perf_pmu *pmu)
>> return buf;
>> }
>> +
>> +/*
>> + * Return 0 if idstr is a higher or equal to version of the same part as
>> + * mapcpuid.
>
> And what other values may be returned? If just 0/1, then can we have a
> bool return value?
>
I don't think that's best for consistency. All the other CPU ID
comparison functions return the strcmp style return values which is the
reverse of booleans. We could change them all to bool, but it would be a
big change, and they'd still have strcmp in the name which suggests
-1/0/1 return values (although -1 is never used here).
I will add to the comment that 1 is returned for a comparison failure
thought. That is missing.
>> + *
>> + * Therefore, if mapcpuid has 0 for revision and variant then any
>> version of
>> + * idstr will match as long as it's the same CPU type.
>> + */
>> +int strcmp_cpuid_str(const char *mapcpuid, const char *idstr)
>> +{
>> + u64 map_id = strtoull(mapcpuid, NULL, 16);
>> + char map_id_variant = FIELD_GET(MIDR_VARIANT_MASK, map_id);
>> + char map_id_revision = FIELD_GET(MIDR_REVISION_MASK, map_id);
>> + u64 id = strtoull(idstr, NULL, 16);
>> + char id_variant = FIELD_GET(MIDR_VARIANT_MASK, id);
>> + char id_revision = FIELD_GET(MIDR_REVISION_MASK, id);
>> + u64 id_fields = ~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK);
>> +
>> + /* Compare without version first */
>> + if ((map_id & id_fields) != (id & id_fields))
>> + return 1;
>> +
>> + /*
>> + * ID matches, now compare version.
>> + *
>> + * Arm revisions (like r0p0) are compared here like two digit semver
>> + * values eg. 1.3 < 2.0 < 2.1 < 2.2. The events json file with the
>> + * highest matching version is used.
>> + *
>> + * r = high value = 'Variant' field in MIDR
>> + * p = low value = 'Revision' field in MIDR
>> + *
>> + */
>> + if (id_variant > map_id_variant)
>> + return 0;
>> +
>> + if (id_variant == map_id_variant && id_revision >= map_id_revision)
>> + return 0;
>> +
>> + /*
>> + * variant is less than mapfile variant or variants are the same but
>> + * the revision doesn't match. Return no match.
>> + */
>> + return 1;
>> +}
>
On 14/08/2023 15:43, John Garry wrote:
> On 14/08/2023 15:15, James Clark wrote:
>>
>> On 14/08/2023 14:07, John Garry wrote:
>>> On 11/08/2023 15:39, James Clark wrote:
>>>> Currently variant and revision fields are masked out of the MIDR so
>>>> it's not possible to compare different versions of the same CPU.
>>>> In a later commit a workaround will be removed just for N2 r0p3, so
>>>> enable comparisons on version.
>>>>
>>>> This has the side effect of changing the MIDR stored in the header of
>>>> the perf.data file to no longer have masked version fields.
>>> Did you consider adding a raw version of _get_cpuid(), which returns the
>>> full MIDR just for the purpose of caller strcmp_cpuid_str()?
>> I did, but I thought that seeing as it would only be used in one place,
>> and that changing the existing one didn't break anything, that it was
>> better to not fragment the CPU ID interface. I thought it might also
>> have repercussions for the other architectures as well. It would also
>> mean that the MIDR that's stored in the header wouldn't have the version
>> information, which if we're starting to do things with that could be bad.
>>
>> There are already callers of strcmp_cpuid_str() so it's probably best to
>> keep it using the same get_cpuid() string. Unless there is a reason
>> _not_ to do it? There isn't really anything that can't be done with it
>> accepting/returning the full unmasked MIDR. If you want the old
>> behavior, you just set the version fields to 0, which I've also used in
>> a later patch and is already done in mapfile.csv
>>
>
> ok, fine, so we seems that we would be following x86 on this in terms of
> using strcmp_cpuid_str(). It would be good to mention that there is
> already a weak version of strcmp_cpuid_str() for !x86 in your commit
> message.
>
Yep I can add this.
> Let me check your code again...
>
> Thanks,
> John
On 8/11/23 15:49, Will Deacon wrote:
> On Wed, Aug 09, 2023 at 01:54:55PM +0100, Suzuki K Poulose wrote:
>> On 08/08/2023 14:16, Will Deacon wrote:
>>> On Tue, Aug 08, 2023 at 09:48:16AM +0100, Suzuki K Poulose wrote:
>>>> On 08/08/2023 09:22, Anshuman Khandual wrote:
>>>>> Sanity checking all the GICC tables for same interrupt number, and ensuring
>>>>> a homogeneous ACPI based machine, could be used for other platform devices
>>>>> as well. Hence this refactors arm_spe_acpi_register_device() into a common
>>>>> helper arm_acpi_register_pmu_device().
>>>>>
>>>>> Cc: Catalin Marinas <catalin.marinas(a)arm.com>
>>>>> Cc: Will Deacon <will(a)kernel.org>
>>>>> Cc: Mark Rutland <mark.rutland(a)arm.com>
>>>>> Cc: linux-arm-kernel(a)lists.infradead.org
>>>>> Cc: linux-kernel(a)vger.kernel.org
>>>>> Co-developed-by: Will Deacon <will(a)kernel.org>
>>>>> Signed-off-by: Will Deacon <will(a)kernel.org>
>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
>>>>> ---
>>>>> drivers/perf/arm_pmu_acpi.c | 105 ++++++++++++++++++++++--------------
>>>>> 1 file changed, 65 insertions(+), 40 deletions(-)
>>>>>
>>>>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>>>>> index 90815ad762eb..72454bef2a70 100644
>>>>> --- a/drivers/perf/arm_pmu_acpi.c
>>>>> +++ b/drivers/perf/arm_pmu_acpi.c
>>>>> + pdev->resource[0].start = irq;
>>>>> + ret = platform_device_register(pdev);
>>>>> + if (ret < 0) {
>>>>> + pr_warn("ACPI: %s: Unable to register device\n", pdev->name);
>>>>> + acpi_unregister_gsi(gsi);
>>>>> + }
>>>>> + return ret;
>>>>
>>>> A postivie return value here could confuse the caller. Also, with my comment
>>>> below, we don't really need to return something from here.
>>>
>>> How does this return a positive value?
>>
>> Right now, there aren't. My point is this function returns a "return value"
>> of another function. And the caller of this function doesn't
>> really follow the "check" it needs. e.g.:
>>
>> ret = foo();
>> if (ret < 0)
>> error;
>> return ret;
>>
>>
>>
>> And the caller only checks for
>>
>> if (ret)
>> error;
>>
>> This seems fragile.
>
> Yeah, the '< 0' check is weird. I'd be inclined to drop that entirely
> from the helper function tbh...
static int __maybe_unused
arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *))
{
.....
.....
ret = platform_device_register(pdev);
if (ret < 0) {
pr_warn("ACPI: %s: Unable to register device\n", pdev->name);
acpi_unregister_gsi(gsi);
}
return ret;
}
We would still need to call acpi_unregister_gsi() in the helper itself in case previous
platform_device_register() did not work correctly. AFAICS, acpi_unregister_gsi() cannot
be moved to the caller. We could change the error check as 'if (ret)' which is the case
in many other places in the tree. Also drop the above generic pr_warn() error message.
static int __maybe_unused
arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
u16 (*parse_gsi)(struct acpi_madt_generic_interrupt *))
{
.....
.....
ret = platform_device_register(pdev);
if (ret)
acpi_unregister_gsi(gsi);
return ret;
}
>
>>>>> + int ret = arm_acpi_register_pmu_device(&spe_dev, ACPI_MADT_GICC_SPE,
>>>>> + arm_spe_parse_gsi);
>>>>> + if (ret)
>>>>> pr_warn("ACPI: SPE: Unable to register device\n");
>>>>
>>>> With this change, a system without SPE interrupt description always
>>>> generates the above message. Is this intended ?
>>>
>>> If there are no irqs, why doesn't this return 0?
>>
>> Apologies, I missed that.
>>
>>> arm_acpi_register_pmu_device() should only fail if either:
>>>
>>> - The static resources passed in are broken
>>> - The tables are not homogeneous
>>> - We fail to register the interrupt
>>>
>>> so something is amiss.
>>
>> Agreed. We don't need duplicate messages about an error ?
>> i.e., one in arm_acpi_register_pmu_device() and another
>> one in the caller ? (Of course adding any missing error msgs).
>
> ... and then just print the registration failure message in the caller.
But we already have such messages in respective callers.
static void arm_spe_acpi_register_device(void)
{
int ret = arm_acpi_register_pmu_device(&spe_dev, ACPI_MADT_GICC_SPE,
arm_spe_parse_gsi);
if (ret)
pr_warn("ACPI: SPE: Unable to register device\n");
}
static void arm_trbe_acpi_register_device(void)
{
int ret = arm_acpi_register_pmu_device(&trbe_dev, ACPI_MADT_GICC_TRBE,
arm_trbe_parse_gsi);
if (ret)
pr_warn("ACPI: TRBE: Unable to register device\n");
}
On 8/11/23 16:30, Will Deacon wrote:
> On Fri, Aug 11, 2023 at 03:55:43PM +0530, Anshuman Khandual wrote:
>>
>>
>> On 8/11/23 15:42, Will Deacon wrote:
>>> On Fri, Aug 11, 2023 at 02:13:42PM +0530, Anshuman Khandual wrote:
>>>> On 8/8/23 13:52, Anshuman Khandual wrote:
>>>>> + /*
>>>>> + * Sanity check all the GICC tables for the same interrupt
>>>>> + * number. For now, only support homogeneous ACPI machines.
>>>>> + */
>>>>> + for_each_possible_cpu(cpu) {
>>>>> + struct acpi_madt_generic_interrupt *gicc;
>>>>> +
>>>>> + gicc = acpi_cpu_get_madt_gicc(cpu);
>>>>> + if (gicc->header.length < len)
>>>>> + return gsi ? -ENXIO : 0;
>>>>> +
>>>>> + this_gsi = parse_gsi(gicc);
>>>>> + if (!this_gsi)
>>>>> + return gsi ? -ENXIO : 0;
>>>>> +
>>>>> + this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
>>>>> + if (!gsi) {
>>>>> + hetid = this_hetid;
>>>>> + gsi = this_gsi;
>>>>> + } else if (hetid != this_hetid || gsi != this_gsi) {
>>>>> + pr_warn("ACPI: %s: must be homogeneous\n", pdev->name);
>>>>> + return -ENXIO;
>>>>> + }
>>>>> + }
>>>>
>>>> As discussed on the previous version i.e V3 thread, will move the
>>>> 'this_gsi' check after parse_gsi(), inside if (!gsi) conditional
>>>> block. This will treat subsequent cpu parse_gsi()'s failure as a
>>>> mismatch thus triggering the pr_warn() message.
>>>>
>>>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>>>> index 845683ca7c64..6eae772d6298 100644
>>>> --- a/drivers/perf/arm_pmu_acpi.c
>>>> +++ b/drivers/perf/arm_pmu_acpi.c
>>>> @@ -98,11 +98,11 @@ arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
>>>> return gsi ? -ENXIO : 0;
>>>>
>>>> this_gsi = parse_gsi(gicc);
>>>> - if (!this_gsi)
>>>> - return gsi ? -ENXIO : 0;
>>>> -
>>>> this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
>>>> if (!gsi) {
>>>> + if (!this_gsi)
>>>> + return 0;
>>>
>>> Why do you need this hunk?
>>
>> Otherwise '0' gsi on all cpus would just clear the above homogeneity
>> test, and end up in acpi_register_gsi() making it fail, but with the
>> following warning before returning with -ENXIO.
>>
>> irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
>> if (irq < 0) {
>> pr_warn("ACPI: %s Unable to register interrupt: %d\n", pdev->name, gsi);
>> return -ENXIO;
>> }
>
> Ah gotcha, thanks.
>
>> Is this behaviour better than returning 0 after detecting '0' gsi in
>> the first cpu to avoid the above mentioned scenario ? Although 0 gsi
>> followed by non-zero ones will still end up warning about a mismatch.
>
> Can we move the check _after_ the loop, then? That way, we still detect
> mismatches but we'll quietly return 0 if nobody has an interrupt.
Sure, will fold in the following changes instead.
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 845683ca7c64..d7beb035345a 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -98,9 +98,6 @@ arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
return gsi ? -ENXIO : 0;
this_gsi = parse_gsi(gicc);
- if (!this_gsi)
- return gsi ? -ENXIO : 0;
-
this_hetid = find_acpi_cpu_topology_hetero_id(cpu);
if (!gsi) {
hetid = this_hetid;
@@ -111,6 +108,15 @@ arm_acpi_register_pmu_device(struct platform_device *pdev, u8 len,
}
}
+ /*
+ * This is a special case where no cpu on
+ * the system has the interrupt and which
+ * could not have been detected via above
+ * homogeneous mismatch test.
+ */
+ if (!this_gsi)
+ return 0;
+
irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE, ACPI_ACTIVE_HIGH);
if (irq < 0) {
pr_warn("ACPI: %s Unable to register interrupt: %d\n", pdev->name, gsi);
Em Fri, Aug 11, 2023 at 03:53:49PM +0100, John Garry escreveu:
> On 11/08/2023 15:39, James Clark wrote:
> > Metrics will be published here [1] going forwards, but they have
> > slightly different scale units. To allow autogenerated metrics to be
> > added more easily, update the scale units to match.
> >
> > The more detailed descriptions have also been taken and added to the
> > common file.
> >
> > [1]:https://urldefense.com/v3/__https://gitlab.arm.com/telemetry-solution/te…
> >
> > Acked-by: Ian Rogers<irogers(a)google.com>
> > Signed-off-by: James Clark<james.clark(a)arm.com>
>
> Reviewed-by: John Garry <john.g.garry(a)oracle.com>
Applied as well, will wait for the others to get the review comment
addressed.
- Arnaldo
This is a completely new approach from V3 [1], although the metrics and
event descriptions are autogenerated, the topdown metrics have been
manually edited to use #no_stall_errata (now directly comparing on the
CPUID in v5). This removes the need to duplicate the whole set of JSONs
when only the topdown metrics are different between N2 and V2.
The CPU ID comparison function still needs to change so that the new
literal can compare on versions, but now no change is needed to mapfile
or the PMU event generation code because we still only support one
set of JSONs per CPU.
[1]: https://lore.kernel.org/lkml/20230711100218.1651995-1-james.clark@arm.com/
------
Changes since v4:
* Replace the #no_stall_errata literal with a more generic function
for comparing CPU IDs. This will hopefully keep configuration out
of the code and inside the JSONs
Changes since v3:
* Instead of duplicating all the metrics, add a new expression
literal that can be used to share the same metrics between N2 and V2
* Move tests to arch/arm64/tests
* Remove changes from jevents.py and mapfile.csv
Changes since v2:
* version -> variant in second commit message
* Add a bit more detail about version matching in the second commit
message
* Update the comments in pmu-events/arch/arm64/mapfile.csv to say that
variant and revision fields are now used
* Increase the CC list
Changes since v1:
* Split last change into two so it doesn't hit the mailing list size
limit
James Clark (6):
perf: cs-etm: Don't duplicate FIELD_GET()
perf arm64: Allow version comparisons of CPU IDs
perf test: Add a test for the new Arm CPU ID comparison behavior
perf vendor events arm64: Update scale units and descriptions of
common topdown metrics
perf vendor events arm64: Update stall_slot workaround for N2 r0p3
perf vendor events arm64: Update N2 and V2 metrics and events using
Arm telemetry repo
tools/perf/arch/arm64/include/arch-tests.h | 3 +
tools/perf/arch/arm64/tests/Build | 1 +
tools/perf/arch/arm64/tests/arch-tests.c | 4 +
tools/perf/arch/arm64/tests/cpuid-match.c | 38 ++
tools/perf/arch/arm64/util/header.c | 64 ++-
tools/perf/arch/arm64/util/pmu.c | 18 +-
.../arch/arm64/arm/neoverse-n2-v2/branch.json | 8 -
.../arch/arm64/arm/neoverse-n2-v2/bus.json | 18 +-
.../arch/arm64/arm/neoverse-n2-v2/cache.json | 155 --------
.../arm64/arm/neoverse-n2-v2/exception.json | 45 ++-
.../arm/neoverse-n2-v2/fp_operation.json | 22 ++
.../arm64/arm/neoverse-n2-v2/general.json | 10 +
.../arm64/arm/neoverse-n2-v2/instruction.json | 143 -------
.../arm64/arm/neoverse-n2-v2/l1d_cache.json | 54 +++
.../arm64/arm/neoverse-n2-v2/l1i_cache.json | 14 +
.../arm64/arm/neoverse-n2-v2/l2_cache.json | 50 +++
.../arm64/arm/neoverse-n2-v2/l3_cache.json | 22 ++
.../arm64/arm/neoverse-n2-v2/ll_cache.json | 10 +
.../arch/arm64/arm/neoverse-n2-v2/memory.json | 39 +-
.../arm64/arm/neoverse-n2-v2/metrics.json | 365 ++++++++++--------
.../arm64/arm/neoverse-n2-v2/pipeline.json | 23 --
.../arm64/arm/neoverse-n2-v2/retired.json | 30 ++
.../arch/arm64/arm/neoverse-n2-v2/spe.json | 12 +-
.../arm/neoverse-n2-v2/spec_operation.json | 110 ++++++
.../arch/arm64/arm/neoverse-n2-v2/stall.json | 30 ++
.../arch/arm64/arm/neoverse-n2-v2/sve.json | 50 +++
.../arch/arm64/arm/neoverse-n2-v2/tlb.json | 66 ++++
.../arch/arm64/arm/neoverse-n2-v2/trace.json | 27 +-
tools/perf/pmu-events/arch/arm64/sbsa.json | 24 +-
tools/perf/pmu-events/metric.py | 17 +-
tools/perf/util/cs-etm.c | 14 +-
tools/perf/util/expr.c | 18 +
tools/perf/util/expr.h | 1 +
tools/perf/util/expr.l | 1 +
tools/perf/util/expr.y | 8 +-
tools/perf/util/pmu.c | 17 +
tools/perf/util/pmu.h | 1 +
37 files changed, 923 insertions(+), 609 deletions(-)
create mode 100644 tools/perf/arch/arm64/tests/cpuid-match.c
delete mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/branch.json
delete mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/cache.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/fp_operation.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/general.json
delete mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/instruction.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1d_cache.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1i_cache.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l2_cache.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l3_cache.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/ll_cache.json
delete mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/pipeline.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/retired.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spec_operation.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/stall.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/sve.json
create mode 100644 tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/tlb.json
--
2.34.1
On 14/08/2023 14:07, John Garry wrote:
> On 11/08/2023 15:39, James Clark wrote:
>> Currently variant and revision fields are masked out of the MIDR so
>> it's not possible to compare different versions of the same CPU.
>> In a later commit a workaround will be removed just for N2 r0p3, so
>> enable comparisons on version.
>>
>> This has the side effect of changing the MIDR stored in the header of
>> the perf.data file to no longer have masked version fields.
>
> Did you consider adding a raw version of _get_cpuid(), which returns the
> full MIDR just for the purpose of caller strcmp_cpuid_str()?
I did, but I thought that seeing as it would only be used in one place,
and that changing the existing one didn't break anything, that it was
better to not fragment the CPU ID interface. I thought it might also
have repercussions for the other architectures as well. It would also
mean that the MIDR that's stored in the header wouldn't have the version
information, which if we're starting to do things with that could be bad.
There are already callers of strcmp_cpuid_str() so it's probably best to
keep it using the same get_cpuid() string. Unless there is a reason
_not_ to do it? There isn't really anything that can't be done with it
accepting/returning the full unmasked MIDR. If you want the old
behavior, you just set the version fields to 0, which I've also used in
a later patch and is already done in mapfile.csv
>
> I can't comment on how it will be called in relation to
> strcmp_cpuid_str(), as I am unsure whether strcmp_cpuid_str() should be
> just in arch arm64 code - see comment on later patch.
>
> It also
>> affects the lookups in mapfile.csv, but as that currently only has
>> zeroed version fields, it has no actual effect. The mapfile.csv
>> documentation also states to zero the version fields, so unless this
>> isn't done it will continue to have no effect.
>>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> tools/perf/arch/arm64/util/header.c | 64 ++++++++++++++++++++++-------
>> 1 file changed, 50 insertions(+), 14 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/header.c
>> b/tools/perf/arch/arm64/util/header.c
>> index 80b9f6287fe2..8f74e801e1ab 100644
>> --- a/tools/perf/arch/arm64/util/header.c
>> +++ b/tools/perf/arch/arm64/util/header.c
>> @@ -1,3 +1,6 @@
>> +#include <linux/kernel.h>
>> +#include <linux/bits.h>
>> +#include <linux/bitfield.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <perf/cpumap.h>
>> @@ -10,14 +13,12 @@
>> #define MIDR "/regs/identification/midr_el1"
>> #define MIDR_SIZE 19
>> -#define MIDR_REVISION_MASK 0xf
>> -#define MIDR_VARIANT_SHIFT 20
>> -#define MIDR_VARIANT_MASK (0xf << MIDR_VARIANT_SHIFT)
>> +#define MIDR_REVISION_MASK GENMASK(3, 0)
>> +#define MIDR_VARIANT_MASK GENMASK(23, 20)
>> static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map
>> *cpus)
>> {
>> const char *sysfs = sysfs__mountpoint();
>> - u64 midr = 0;
>> int cpu;
>> if (!sysfs || sz < MIDR_SIZE)
>> @@ -44,21 +45,11 @@ static int _get_cpuid(char *buf, size_t sz, struct
>> perf_cpu_map *cpus)
>> }
>> fclose(file);
>> - /* Ignore/clear Variant[23:20] and
>> - * Revision[3:0] of MIDR
>> - */
>> - midr = strtoul(buf, NULL, 16);
>> - midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
>> - scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
>> /* got midr break loop */
>> break;
>> }
>> perf_cpu_map__put(cpus);
>> -
>> - if (!midr)
>> - return EINVAL;
>> -
>> return 0;
>> }
>> @@ -99,3 +90,48 @@ char *get_cpuid_str(struct perf_pmu *pmu)
>> return buf;
>> }
>> +
>> +/*
>> + * Return 0 if idstr is a higher or equal to version of the same part as
>> + * mapcpuid.
>> + *
>> + * Therefore, if mapcpuid has 0 for revision and variant then any
>> version of
>> + * idstr will match as long as it's the same CPU type.
>> + */
>> +int strcmp_cpuid_str(const char *mapcpuid, const char *idstr)
>
> should we check implementator and other fields as a sanity check?
>
I'm not sure what we could check them against? There are no existing
sanity checks around the MIDR stuff, it just takes whatever MIDR is in
mapfile.csv and from the system. Unless there is something specific to
this change that would benefit from it? Otherwise it sounds like it
could be a separate additional change to what's already in Perf.
>> +{
>> + u64 map_id = strtoull(mapcpuid, NULL, 16);
>> + char map_id_variant = FIELD_GET(MIDR_VARIANT_MASK, map_id);
>> + char map_id_revision = FIELD_GET(MIDR_REVISION_MASK, map_id);
>> + u64 id = strtoull(idstr, NULL, 16);
>> + char id_variant = FIELD_GET(MIDR_VARIANT_MASK, id);
>> + char id_revision = FIELD_GET(MIDR_REVISION_MASK, id);
>> + u64 id_fields = ~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK);
>> +
>> + /* Compare without version first */
>> + if ((map_id & id_fields) != (id & id_fields))
>> + return 1;
>> +
>> + /*
>> + * ID matches, now compare version.
>> + *
>> + * Arm revisions (like r0p0) are compared here like two digit semver
>> + * values eg. 1.3 < 2.0 < 2.1 < 2.2. The events json file with the
>> + * highest matching version is used.
>> + *
>> + * r = high value = 'Variant' field in MIDR
>> + * p = low value = 'Revision' field in MIDR
>> + *
>> + */
>> + if (id_variant > map_id_variant)
>> + return 0;
>> +
>> + if (id_variant == map_id_variant && id_revision >= map_id_revision)
>> + return 0;
>> +
>> + /*
>> + * variant is less than mapfile variant or variants are the same but
>> + * the revision doesn't match. Return no match.
>> + */
>> + return 1;
>> +}
>
> Thanks,
> John
>