Hi Al and Mike,
16 (0xF) should work for all silicon, as AXI allows burst sizes up to 16. So unless we've missed something, this is an implementation non-compliance and we should not be penalising compliant implementations by reducing the default burst size - the question is how we can enable the upstream kernel to work around the issue on this chip only, and to me that sounds like it needs something that can be triggered by a setting in DT/ACPI.
Another possibility would be to introduce an errata workaround in Kconfig for your silicon. There are a number of these already in Kconfig for PE issues, e.g. CONFIG_ARM64_ERRATUM_826319, and we have introduced CONFIG_ETM4X_IMPDEF_FEATURE for silicon-specific variants in the ETMv4 driver.
The latter config compiles in implementation-defined workarounds which operate on the basis of matching the AMBA ID for the silicon. This means they will operate only if configured in Kconfig and only on silicon where the workaround is needed.
Thanks a lot for the suggestions.
We are thinking of using the "part number" and "designer" fields rather than a device tree property. Because we use standard ARM core IP and CoreSight SoC-600 IP, we cannot differentiate our silicon from others using the ETR AMBA ID, PIDR and CPU MIDR registers.
We are proposing to expose the CoreSight ROM region to the driver and determine the part number and designer by reading the following fields:

	part_number = (PIDR1.PART_1 << 8) | PIDR0.PART_0;
	designer    = (PIDR4.DES_2 << 7) | (PIDR2.DES_1 << 4) | PIDR1.DES_0;
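For illustration, a minimal C sketch (not the actual patch) of how a driver could assemble these two values from the memory-mapped Peripheral ID registers of the ROM table; the register offsets are the standard CoreSight PIDR locations, while the function name and surrounding structure here are hypothetical:

#include <linux/io.h>
#include <linux/types.h>

#define PIDR0_OFF	0xfe0	/* PART_0 in bits [7:0] */
#define PIDR1_OFF	0xfe4	/* PART_1 in bits [3:0], DES_0 in bits [7:4] */
#define PIDR2_OFF	0xfe8	/* DES_1 in bits [2:0] */
#define PIDR4_OFF	0xfd0	/* DES_2 (JEP106 continuation code) in bits [3:0] */

static void rom_read_ids(void __iomem *rom_base, u16 *part, u16 *designer)
{
	u32 pidr0 = readl(rom_base + PIDR0_OFF);
	u32 pidr1 = readl(rom_base + PIDR1_OFF);
	u32 pidr2 = readl(rom_base + PIDR2_OFF);
	u32 pidr4 = readl(rom_base + PIDR4_OFF);

	/* 12-bit part number: PART_1 forms bits [11:8], PART_0 bits [7:0] */
	*part = ((pidr1 & 0xf) << 8) | (pidr0 & 0xff);

	/* JEP106 designer code: DES_2 << 7 | DES_1 << 4 | DES_0 */
	*designer = ((pidr4 & 0xf) << 7) | ((pidr2 & 0x7) << 4) |
		    ((pidr1 >> 4) & 0xf);
}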
Using a combination of the part number and designer from the ROM region would help in identifying the Marvell implementation.
This option would be generic, could be helpful for other silicon with similar issues, and can be applied across CoreSight components like ETF/ETR.
What are your thoughts on this approach?
With Regards, Tanmay
It's a valid approach if you can locate the CoreSight ROM region. Some CPUs don't set the CPU DBGDRAR (ROM region base address) register, so as a generic mechanism, there would need to be some alternative way of locating the ROM region, e.g. through DT or ACPI.
Al
On Mon, Aug 23, 2021 at 09:44:51AM +0000, Al Grant wrote:
[...]
Sorry for jumping into this topic suddenly. Since I have some experience with DT bindings, I agree with Al that DT (or ACPI) is the right way to move forward.
You can see there are already examples of DT bindings for burst size in DMA and memory controllers, e.g.
in Documentation/devicetree/bindings/dma/arm-pl08x.yaml:
  memcpy-burst-size:
    $ref: /schemas/types.yaml#/definitions/uint32
    enum:
      - 1
      - 4
      - 8
      - 16
      - 32
      - 64
      - 128
      - 256
    description: the size of the bursts for memcpy
Just a reminder: a DT binding is not meant to set any specific register value; rather, a DT property describes a hardware characteristic. So I think it's reasonable to add a new property to describe the burst size for the sink; if the DT has no such property, the driver will fall back to the default value of '16'.
Thanks, Leo
On Mon, Aug 23, 2021 at 06:13:04PM +0800, Leo Yan wrote:
[...]
I'd like to clarify a bit more here:
A good way is to use a DT binding with a new node property to describe the burst size; e.g. in Documentation/devicetree/bindings/arm/coresight.txt, we can add an optional property like below:
* Optional property for TMC:

	* arm,burst-size: burst size initiated by the TMC on the AXI master
	  interface. The value can be in the range [0..15]; the setting
	  supports from one data transfer per burst up to a maximum of 16
	  data transfers per burst. If this property is not set, the driver
	  defaults to 15 (16 data transfers per burst).
I don't think it is right to use the CoreSight ROM to read out the part number and designer, and then set the burst size based on those values. The main reason is that this would add SoC-specific code into the driver, whereas a DT binding exists precisely to decouple platform specifics from the driver code; with a DT binding, the driver stays as generic as possible.
Using an errata workaround is also not the right thing to do. Based on the TMC spec (ARM DDI 0461B), the burst size on the AXI bus can differ between SoCs, so it's not a silicon bug if a SoC uses a burst size other than '15' (which is hard-coded in the current driver). We can use a DT binding to let a platform pass its specific burst size, and the driver can then program the 'correct' burst size into the AXICTL register. For backward compatibility, if the property is not set, the driver will use the default value 'TMC_AXICTL_WR_BURST_16'.
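As a rough illustration of how the driver side could consume such a property, here is a sketch only, assuming the proposed "arm,burst-size" name and the WrBurstLen field in AXICTL bits [11:8]; the function and macro names are made up for this example, not taken from the existing driver:

#include <linux/device.h>
#include <linux/io.h>
#include <linux/property.h>

#define TMC_AXICTL			0x110	/* AXI Control Register */
#define TMC_AXICTL_WR_BURST_SHIFT	8	/* WrBurstLen lives in bits [11:8] */
#define TMC_AXICTL_WR_BURST_MASK	0xf	/* 0xf = 16 data transfers per burst */

static void tmc_etr_set_burst_size(struct device *dev, void __iomem *base)
{
	u32 burst = TMC_AXICTL_WR_BURST_MASK;	/* default: 16 transfers per burst */
	u32 axictl;

	/* Optional DT/ACPI property; if it is absent, keep the default. */
	device_property_read_u32(dev, "arm,burst-size", &burst);

	axictl = readl_relaxed(base + TMC_AXICTL);
	axictl &= ~(TMC_AXICTL_WR_BURST_MASK << TMC_AXICTL_WR_BURST_SHIFT);
	axictl |= (burst & TMC_AXICTL_WR_BURST_MASK) << TMC_AXICTL_WR_BURST_SHIFT;
	writel_relaxed(axictl, base + TMC_AXICTL);
}

Using device_property_read_u32() keeps the same code path working for both DT and ACPI descriptions.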
Thanks, Leo
-----Original Message-----
From: Leo Yan <leo.yan@linaro.org>
Sent: 24 August 2021 04:12
To: Al Grant <Al.Grant@arm.com>
Cc: Tanmay Jagdale <tanmay@marvell.com>; Mike Leach <mike.leach@linaro.org>; coresight@lists.linaro.org; Sunil Kovvuri Goutham <sgoutham@marvell.com>; Linu Cherian <lcherian@marvell.com>; Bharat Bhushan <bbhushan2@marvell.com>
Subject: Re: Query regarding AXI Write Burst Length in ETR driver
[...]
Using an errata workaround is also not the right thing to do. Based on the TMC spec (ARM DDI 0461B), the burst size on the AXI bus can differ between SoCs [...]
Hi Leo, where are you reading that this is a property of the AXI bus? There is a constraint that the burst size must not be greater than the TMC buffer size (see DDI 0461B 3.3.1), but buffer size is indicated by the TMC's MEMWIDTH register and the driver can determine that.
The situation I thought we were dealing with is where the TMC presents a burst that the AXI spec says is valid but the downstream component in this SoC can't accept a burst of that size. Our understanding is that this would be a non-compliance in the downstream component - i.e. failure to handle a valid AXI transaction - and it would be appropriate to treat this as a SoC-specific issue triggering a SoC-specific workaround. (The alternative would be to represent the AXI topology in DT/ACPI and then drive the workaround from knowing what the downstream component is.)
Al
Hi Al,
On Tue, Aug 24, 2021 at 02:55:07PM +0000, Al Grant wrote:
[...]
Hi Leo, where are you reading that this is a property of the AXI bus?
No, I cannot find any description in the TMC spec (ARM DDI 0461B) saying that AXICTL.WrBurstLen is a property of the AXI bus.
After reading the AXICTL register description (DDI 0461B 3.3.15), I see that I mixed up the concepts: AXICTL.WrBurstLen is an attribute of the TMC, controlling the maximum number of data transfers it initiates per burst; it is not a property of the AXI bus itself.
Sorry for the confusion.
There is a constraint that the burst size must not be greater than the TMC buffer size (see DDI 0461B 3.3.1), but buffer size is indicated by the TMC's MEMWIDTH register and the driver can determine that.
DEVID.MEMWIDTH: the width of the memory interface databus. For ETB/ETF, DEVID.MEMWIDTH = 2 * (ATB data width); for ETR, DEVID.MEMWIDTH = ATB data width.
AXICTL.WrBurstLen: the maximum number of data transfers that can occur within each burst.
RSZ.RSZ: the size of the local RAM buffer (in 32-bit words).
And there is a requirement relating the burst size to the TMC buffer size:
2 ^ (DEVID.MEMWIDTH - 2) * AXICTL.WrBurstLen <= RSZ.RSZ
The situation I thought we were dealing with is where the TMC presents a burst that the AXI spec says is valid but the downstream component in this SoC can't accept a burst of that size. Our understanding is that this would be a non-compliance in the downstream component - i.e. failure to handle a valid AXI transaction - and it would be appropriate to treat this as a SoC-specific issue triggering a SoC-specific workaround. (The alternative would be to represent the AXI topology in DT/ACPI and then drive the workaround from knowing what the downstream component is.)
Okay, I see: the downstream component (e.g. a memory controller) has a hardware constraint, and this imposes a limitation back on the TMC burst size.
The downstream can be a memory controller or any other component. I understand your suggestion to use DT device nodes to bind the TMC to its downstream device, so that a property on the downstream device tells us the constraint on the burst size.
But it's hard to unify such a property across downstream components; I also checked the latest Devicetree spec [1], and if we create a binding between the TMC device and a memory node ('memory' is a generic node in the DT spec), the 'memory' node doesn't provide a property for burst size.
So it seems to me that a feasible (and simple) way is still to add a property to the TMC node: since the TMC is connected to its downstream component through the AXI bus, the developer for a platform knows the constraint on the maximum burst size.
* Optional property for TMC:

	* arm,max-burst-size: the maximum burst size initiated by the TMC on
	  the AXI master interface. The value can be in the range [0..15];
	  the setting supports from one data transfer per burst up to a
	  maximum of 16 data transfers per burst.

When a platform has a specific constraint on the maximum burst size (e.g. caused by its downstream component), it can set this property accordingly; if the property isn't set, the driver will use 15 (16 data transfers per burst) as the default value.
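One possible way the driver could honour such a cap (again just a sketch with illustrative names; whether to clamp the default or program the property value directly is a design choice for the actual patch):

#include <linux/device.h>
#include <linux/minmax.h>
#include <linux/property.h>

#define TMC_AXICTL_WR_BURST_DEFAULT	0xf	/* 16 data transfers per burst */

static u32 tmc_etr_max_burst(struct device *dev)
{
	u32 max_burst = TMC_AXICTL_WR_BURST_DEFAULT;

	/* Optional platform cap; absence means no extra constraint. */
	device_property_read_u32(dev, "arm,max-burst-size", &max_burst);

	/* Never program a burst longer than the platform can accept. */
	return min_t(u32, TMC_AXICTL_WR_BURST_DEFAULT, max_burst);
}

The returned value would then be written into the WrBurstLen field of AXICTL.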
I hope I'm not being arbitrary here, and I'm curious what others think. Any thoughts?
Thanks, Leo
[1] https://github.com/devicetree-org/devicetree-specification/releases/download...