This moves remaining AMBA ACPI devices into respective platform drivers for
enabling ACPI based power management support. This series applies on latest
coresight/next branch. This series has been built, and boot tested on a DT
based coresight platform. Although this still requires some more testing on
ACPI based coresight platforms.
https://git.gitlab.arm.com/linux-arm/linux-anshuman.git (amba_other_acpi_migration_v3)
Cc: Lorenzo Pieralisi <lpieralisi(a)kernel.org>
Cc: Sudeep Holla <sudeep.holla(a)arm.com>
Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: James Clark <james.clark(a)arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32(a)gmail.com>
Cc: Alexandre Torgue <alexandre.torgue(a)foss.st.com>
Cc: linux-acpi(a)vger.kernel.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Cc: coresight(a)lists.linaro.org
Cc: linux-stm32(a)st-md-mailman.stormreply.com
Changes in V3:
- Split coresight_init_driver/remove_driver() helpers into a separate patch
- Added 'drvdata->pclk' comments in replicator, funnel, tpiu, tmc, and stm devices
- Updated funnel, and replicator drivers to use these new helpers
- Check for drvdata instead of drvdata->pclk in suspend and resume paths in catu,
tmc and debug devices
- Added patch to extract device name from AMBA pid based table lookup for stm
- Added patch to extract device properties from AMBA pid based table look for tmc
- Dropped pm_runtime_put() from common __probe() functions
- Handled pm_runtime_put() in AMBA driver in success path
- Handled pm_runtime_put() in platform driver in both success and error paths
Changes in V2:
https://lore.kernel.org/all/20231201062053.1268492-1-anshuman.khandual@arm.…
- Dropped redundant devm_ioremap_resource() hunk from tmc_platform_probe()
- Defined coresight_[init|remove]_driver() for both AMBA/platform drivers
- Changed catu, tmc, tpiu, stm and debug coresight drivers to use the new
helpers avoiding build issues arising from module_amba_driver(), and
module_platform_driver() being on the same file
Changes in V1:
https://lore.kernel.org/all/20231027072943.3418997-1-anshuman.khandual@arm.…
- Replaced all IS_ERR() instances with IS_ERR_OR_NULL() as per Suzuki
Changes in RFC:
https://lore.kernel.org/all/20230921042040.1334641-1-anshuman.khandual@arm.…
Anshuman Khandual (10):
coresight: stm: Extract device name from AMBA pid based table lookup
coresight: tmc: Extract device properties from AMBA pid based table lookup
coresight: Add helpers registering/removing both AMBA and platform drivers
coresight: replicator: Move ACPI support from AMBA driver to platform driver
coresight: funnel: Move ACPI support from AMBA driver to platform driver
coresight: catu: Move ACPI support from AMBA driver to platform driver
coresight: tpiu: Move ACPI support from AMBA driver to platform driver
coresight: tmc: Move ACPI support from AMBA driver to platform driver
coresight: stm: Move ACPI support from AMBA driver to platform driver
coresight: debug: Move ACPI support from AMBA driver to platform driver
drivers/acpi/arm64/amba.c | 8 -
drivers/hwtracing/coresight/coresight-catu.c | 140 +++++++++++++---
drivers/hwtracing/coresight/coresight-catu.h | 1 +
drivers/hwtracing/coresight/coresight-core.c | 29 ++++
.../hwtracing/coresight/coresight-cpu-debug.c | 141 ++++++++++++++--
.../hwtracing/coresight/coresight-funnel.c | 87 +++++-----
drivers/hwtracing/coresight/coresight-priv.h | 10 ++
.../coresight/coresight-replicator.c | 81 ++++-----
drivers/hwtracing/coresight/coresight-stm.c | 115 +++++++++++--
.../hwtracing/coresight/coresight-tmc-core.c | 156 +++++++++++++++---
drivers/hwtracing/coresight/coresight-tmc.h | 2 +
drivers/hwtracing/coresight/coresight-tpiu.c | 99 +++++++++--
include/linux/coresight.h | 7 +
13 files changed, 713 insertions(+), 163 deletions(-)
--
2.25.1
Hi Steve
On 04/12/2023 21:41, Steve Clevenger wrote:
>
> Hi Suzuki,
>
> On 9/7/2023 2:31 PM, Steve Clevenger wrote:
>>
>> Hi Suzuki,
>>
>> On 9/1/2023 5:22 AM, Suzuki K Poulose wrote:
>>> Hi Steve
>>>
>>> On 30/08/2023 17:04, Suzuki K Poulose wrote:
>>>> Cc: Sudeep
>>>>
>>>> Hi Steve
>>>>
>>>> On 28/08/2023 17:35, Steve Clevenger wrote:
>>>>>
>>>>> Hi Suzuki,
>>>>>
>>>>> On 8/27/2023 2:35 PM, Suzuki K Poulose wrote:
>>>>>> Hi Steve
>>>>>>
>>>>>> On 26/08/2023 01:14, Steve Clevenger wrote:
>>>>>>>
>>>>>>> Unfortunately, I tested with the original patch not [PATCH V2]. I've
>>>>>>> remedied this. My results below:
>>>>>>>
>>>>>>> [root@sut01sys-b212 linux]# cat
>>>>>>> /sys/devices/system/cpu/cpu123/ARMHC501\:23/tmc_etr35/buf_modes_available
>>>>>>> auto flat catu
>>>>>>> [root@sut01sys-b212 linux]# cat
>>>>>>> /sys/devices/system/cpu/cpu123/ARMHC501\:23/tmc_etr35/buf_mode_preferred
>>>>>>> auto
>>>>>>> [root@sut01sys-b212 linux]# echo "catu" >
>>>>>>> /sys/devices/system/cpu/cpu123/ARMHC501\:23/tmc_etr35/buf_mode_preferred
>>>>>>> [root@sut01sys-b212 linux]# cat
>>>>>>> /sys/devices/system/cpu/cpu123/ARMHC501\:23/tmc_etr35/buf_mode_preferred
>>>>>>> catu
>>>>>>>
>>>>>>> As with the V1 patch, auto defaults to catu.
>>>>>>>
>>>>>>> I expected to see tmc-sg (former default) as an available mode, but do
>>>>>>> not. As I recall, the buffer mode defaulted to ETR scatter-gather
>>>>>>> prior
>>>>>>> to this patch. Must this capability now be explicitly advertised? I've
>>>>>>> seen this done as "arm,scatter-gather" in device trees, but not
>>>>>>> used by
>>>>>>> Ampere. Perhaps someone can enlighten me.
>>>>>>
>>>>>> Yes, you must add that property to the TMC-ETR node (for both DT and
>>>>>> ACPI). In the past, almost all of the TMC-ETRs (except Juno board)
>>>>>> locked up the system while using the SG mode (due to the interconnect
>>>>>> issues, something to do with the transaction). Thus, we decided to
>>>>>> add a property explicitly enabling this for a given platform.
>>>>>>
>>>>>> When you mentioned, it was using TMC-ETR SG mode, how did you verify
>>>>>> this ? Please be aware that the table allocation code etc are shared
>>>>>> by both TMC-SG and CATU.
>>>>>>
>>>>>
>>>>> You might recall how this started. I had no way to test the CATU due to
>>>>> the order the ETR modes defaulted (Flat, ETR-SG, CATU). For test
>>>>> purposes, I programmatically swapped the ETR-SG/CATU order and could
>>>>> then verify CATU operation by the driver calling into CATU code. This
>>>>
>>>> So, were you using the DT based boot for the above runs ?
>>>>
>>>>> suggests Flat mode was bypassed, and the driver defaulted to ETR-SG
>>>>> prior to this hack. This didn't offer the user any control, hence my
>>>>> feature request. Note that most of the early Ampere self-hosted trace
>>>>> collection used ETR-SG. Now I can't select it.
>>>>
>>>>
>>>>>
>>>>> How is this property described in the ACPI? The "ACPI for CoreSight™ 1.1
>>>>> Platform Design Document" (DEN0067) doesn't describe this.
>>>>
>>>> This is not specified in the ACPI platform design document. I can get
>>>> it fixed. Ideally we need a property describing that the scatter-gather
>>>> mode is safe to use.
>>>
>>> Looks like this is not straight forward copying of DT property. We are
>>> investigating this on our side and will get back to you.
>
> I noticed this work is queued for 6.7 (coresight-next-6.7). Do you have
> an update to the ACPI platform design document Ampere can use to base an
> update so the scatter-gather mode can be used?
I have chased this with the specification team, and we should have
something published soon. Apologies for the delay.
Suzuki
>
> Thanks,
> Steve
>
>>>
>>
>> The intent behind my request was to have a way to SysFS configure the
>> (available) ETR mode. Unless there's a change to the ACPI, the CATU is
>> the only SG option for Ampere in the near term.
>>
>> Thanks,
>> Steve
>>
>>
>>> Suzuki
>>>
>>>
>>>>
>>>> DT uses "arm,scatter-gather" property [0] and this is what we now expect
>>>> in the ACPI based systems too.
>>>>
>>>> https://elixir.bootlin.com/linux/latest/source/Documentation/devicetree/bin…
>>>>
>>>> Does it sound fine ?
>>>>
>>>> Suzuki
>>>>
>>>>>
>>>>> Thanks,
>>>>> Steve
>>>>>
>>>>>
>>>>>> Kind regards
>>>>>> Suzuki
>>>>>>
>>>>>>>
>>>>>>> Steve C.
>>>>>>>
>>>>>>> On 8/23/2023 4:10 PM, Steve Clevenger wrote:
>>>>>>>>
>>>>>>>> Here's some quick feedback. My system shows two modes available; auto
>>>>>>>> catu
>>>>>>>>
>>>>>>>> etr_buf_mode_current is writable. I expected to see tmc-sg (former
>>>>>>>> default) listed in etr_buf_modes_available but it doesn't show up.
>>>>>>>>
>>>>>>>> Note that both the auto and catu etr_buf_mode_current settings
>>>>>>>> default
>>>>>>>> to catu. My understanding is auto should revert to the default
>>>>>>>> behavior.
>>>>>>>> On my system the default was tmc-sg.
>>>>>>>>
>>>>>>>> More later.
>>>>>>>>
>>>>>>>> [root@sut01sys-b212 kernel]# cat
>>>>>>>> /sys/devices/system/cpu/cpu20/ARMHC501\:60/tmc_etr96/etr_buf_modes_available
>>>>>>>>
>>>>>>>> auto catu
>>>>>>>> [root@sut01sys-b212 kernel]# cat
>>>>>>>> /sys/devices/system/cpu/cpu20/ARMHC501\:60/tmc_etr96/etr_buf_mode_current
>>>>>>>> catu
>>>>>>>> [root@sut01sys-b212 kernel]# echo "catu" >
>>>>>>>> /sys/devices/system/cpu/cpu20/ARMHC501\:60/tmc_etr96/etr_buf_mode_current
>>>>>>>> [root@sut01sys-b212 kernel]# cat
>>>>>>>> /sys/devices/system/cpu/cpu20/ARMHC501\:60/tmc_etr96/etr_buf_mode_current
>>>>>>>> catu
>>>>>>>>
>>>>>>>> Steve C.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/21/2023 12:40 PM, Steve Clevenger wrote:
>>>>>>>>>
>>>>>>>>> Hi Suzuki,
>>>>>>>>>
>>>>>>>>> I may be able to test it this week. You've already pointed me at the
>>>>>>>>> patch thread(s). The main holdup is I need to merge the 6.6 pending
>>>>>>>>> platform work in order to use the Ampere ACPI. I couldn't get these
>>>>>>>>> patches to apply directly to 6.4 last I tried.
>>>>>>>>>
>>>>>>>>> Steve C.
>>>>>>>>>
>>>>>>>>> On 8/18/2023 2:39 AM, Suzuki K Poulose wrote:
>>>>>>>>>> Cc: Steve
>>>>>>>>>>
>>>>>>>>>> Steve,
>>>>>>>>>>
>>>>>>>>>> Are you able to test this with CATU ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 18/08/2023 09:21, Anshuman Khandual wrote:
>>>>>>>>>>> Currently TMC-ETR automatically selects the buffer mode from all
>>>>>>>>>>> available
>>>>>>>>>>> methods in the following sequentially fallback manner - also in
>>>>>>>>>>> that
>>>>>>>>>>> order.
>>>>>>>>>>>
>>>>>>>>>>> 1. FLAT mode with or without IOMMU
>>>>>>>>>>> 2. TMC-ETR-SG (scatter gather) mode when available
>>>>>>>>>>> 3. CATU mode when available
>>>>>>>>>>>
>>>>>>>>>>> But this order might not be ideal for all situations. For
>>>>>>>>>>> example if
>>>>>>>>>>> there
>>>>>>>>>>> is a CATU connected to ETR, it may be better to use TMC-ETR
>>>>>>>>>>> scatter
>>>>>>>>>>> gather
>>>>>>>>>>> method, rather than CATU. But hard coding such order changes will
>>>>>>>>>>> prevent
>>>>>>>>>>> us from testing or using a particular mode. This change provides
>>>>>>>>>>> following
>>>>>>>>>>> new sysfs tunables for the user to control TMC-ETR buffer mode
>>>>>>>>>>> explicitly,
>>>>>>>>>>> if required. This adds following new sysfs files for buffer mode
>>>>>>>>>>> selection
>>>>>>>>>>> purpose explicitly in the user space.
>>>>>>>>>>>
>>>>>>>>>>> /sys/bus/coresight/devices/tmc_etr<N>/buf_modes_available
>>>>>>>>>>> /sys/bus/coresight/devices/tmc_etr<N>/buf_mode_preferred
>>>>>>>>>>>
>>>>>>>>>>> $ cat buf_modes_available
>>>>>>>>>>> auto flat tmc-sg catu ------------------> Supported TMC-ETR
>>>>>>>>>>> buffer
>>>>>>>>>>> modes
>>>>>>>>>>>
>>>>>>>>>>> $ echo catu > buf_mode_preferred -------> Explicit buffer mode
>>>>>>>>>>> request
>>>>>>>>>>>
>>>>>>>>>>> But explicit user request has to be within supported ETR buffer
>>>>>>>>>>> modes
>>>>>>>>>>> only.
>>>>>>>>>>> These sysfs interface files are exclussive to ETR, and hence these
>>>>>>>>>>> are
>>>>>>>>>>> not
>>>>>>>>>>> available for other TMC devices such as ETB or ETF etc.
>>>>>>>>>>>
>>>>>>>>>>> A new auto' mode (i.e ETR_MODE_AUTO) has been added to help
>>>>>>>>>>> fallback
>>>>>>>>>>> to the
>>>>>>>>>>> existing default behaviour, when user provided preferred buffer
>>>>>>>>>>> mode
>>>>>>>>>>> fails.
>>>>>>>>>>> ETR_MODE_FLAT and ETR_MODE_AUTO are always available as preferred
>>>>>>>>>>> modes.
>>>>>>>>>>>
>>>>>>>>>>> Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>>>>>>>>>>> Cc: Mike Leach <mike.leach(a)linaro.org>
>>>>>>>>>>> Cc: James Clark <james.clark(a)arm.com>
>>>>>>>>>>> Cc: Leo Yan <leo.yan(a)linaro.org>
>>>>>>>>>>> Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
>>>>>>>>>>> Cc: coresight(a)lists.linaro.org
>>>>>>>>>>> Cc: linux-arm-kernel(a)lists.infradead.org
>>>>>>>>>>> Cc: linux-kernel(a)vger.kernel.org
>>>>>>>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
>>>>>>>>>>> ---
>>>>>>>>>>> This applies on v6.5-rc6
>>>>>>>>>>>
>>>>>>>>>>> Changes in V2:
>>>>>>>>>>>
>>>>>>>>>>> - Renamed sysfs file etr_buf_modes_available as
>>>>>>>>>>> buf_modes_available
>>>>>>>>>>> - Renamed sysfs file buf_mode_current as buf_mode_preferred
>>>>>>>>>>> - Renamed etr_supports_flat_mode() as etr_can_use_flat_mode()
>>>>>>>>>>> - Renamed coresight_tmc_groups[] as coresight_etf_groups[]
>>>>>>>>>>> - Reused coresight_tmc_group[] for trigger_cntr and buffer_size
>>>>>>>>>>> - Fallback trying ETR_MODE_AUTO when user preferred mode fails
>>>>>>>>>>> - Moved ETR sysfs details into coresight-tmc-etr.c
>>>>>>>>>>> - Dropped etr_can_use_flat_mode() check while offering
>>>>>>>>>>> ETR_MODE_FLAT
>>>>>>>>>>> in sysfs
>>>>>>>>>>> - Moved struct etr_buf_hw inside coresight-tmc-etr.c
>>>>>>>>>>> - Moved get_etr_buf_hw() and etr_can_use_flat_mode() inside
>>>>>>>>>>> coresight-tmc-etr.c
>>>>>>>>>>> - Updated month in
>>>>>>>>>>> Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
>>>>>>>>>>>
>>>>>>>>>>> Changes in V1:
>>>>>>>>>>>
>>>>>>>>>>> https://lore.kernel.org/all/20230728084837.276551-1-anshuman.khandual@arm.c…
>>>>>>>>>>>
>>>>>>>>>>> .../testing/sysfs-bus-coresight-devices-tmc | 16 +++
>>>>>>>>>>> .../hwtracing/coresight/coresight-tmc-core.c | 15 ++-
>>>>>>>>>>> .../hwtracing/coresight/coresight-tmc-etr.c | 111
>>>>>>>>>>> ++++++++++++++++--
>>>>>>>>>>> drivers/hwtracing/coresight/coresight-tmc.h | 3 +
>>>>>>>>>>> 4 files changed, 131 insertions(+), 14 deletions(-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Looks good to me.
>>>>>>>>>>
>>>>>>>>>> Suzuki
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>>>
On 06/12/2023 03:14, Greg KH wrote:
> On Thu, Nov 16, 2023 at 05:01:19PM +0000, Suzuki K Poulose wrote:
>> Hi Greg
>>
>> Please find fixes for hwtracing/coresight subsystem, targetting Linux v6.7.
>
> Sorry for the delay, now pulled and pushed out.
Thank you, Greg!
Suzuki
This moves remaining AMBA ACPI devices into respective platform drivers for
enabling ACPI based power management support. This series applies on latest
coresight/next branch. This series has been built, and boot tested on a DT
based coresight platform. Although this still requires some more testing on
ACPI based coresight platforms.
https://git.gitlab.arm.com/linux-arm/linux-anshuman.git (amba_other_acpi_migration_v2)
Cc: Lorenzo Pieralisi <lpieralisi(a)kernel.org>
Cc: Sudeep Holla <sudeep.holla(a)arm.com>
Cc: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: James Clark <james.clark(a)arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32(a)gmail.com>
Cc: Alexandre Torgue <alexandre.torgue(a)foss.st.com>
Cc: linux-acpi(a)vger.kernel.org
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-kernel(a)vger.kernel.org
Cc: coresight(a)lists.linaro.org
Cc: linux-stm32(a)st-md-mailman.stormreply.com
Changes in V2:
- Dropped redundant devm_ioremap_resource() hunk from tmc_platform_probe()
- Defined coresight_[init|remove]_driver() for both AMBA/platform drivers
- Changed catu, tmc, tpiu, stm and debug coresight drivers to use the new
helpers avoiding build issues arising from module_amba_driver(), and
module_platform_driver() being on the same file
Changes in V1:
https://lore.kernel.org/all/20231027072943.3418997-1-anshuman.khandual@arm.…
- Replaced all IS_ERR() instances with IS_ERR_OR_NULL() as per Suzuki
Changes in RFC:
https://lore.kernel.org/all/20230921042040.1334641-1-anshuman.khandual@arm.…
Anshuman Khandual (7):
coresight: replicator: Move ACPI support from AMBA driver to platform driver
coresight: funnel: Move ACPI support from AMBA driver to platform driver
coresight: catu: Move ACPI support from AMBA driver to platform driver
coresight: tpiu: Move ACPI support from AMBA driver to platform driver
coresight: tmc: Move ACPI support from AMBA driver to platform driver
coresight: stm: Move ACPI support from AMBA driver to platform driver
coresight: debug: Move ACPI support from AMBA driver to platform driver
drivers/acpi/arm64/amba.c | 8 -
drivers/hwtracing/coresight/coresight-catu.c | 130 +++++++++++++---
drivers/hwtracing/coresight/coresight-catu.h | 1 +
drivers/hwtracing/coresight/coresight-core.c | 29 ++++
.../hwtracing/coresight/coresight-cpu-debug.c | 141 ++++++++++++++++--
.../hwtracing/coresight/coresight-funnel.c | 49 +++---
.../coresight/coresight-replicator.c | 44 +++---
drivers/hwtracing/coresight/coresight-stm.c | 91 +++++++++--
.../hwtracing/coresight/coresight-tmc-core.c | 130 ++++++++++++++--
drivers/hwtracing/coresight/coresight-tmc.h | 1 +
drivers/hwtracing/coresight/coresight-tpiu.c | 87 ++++++++++-
include/linux/coresight.h | 7 +
12 files changed, 608 insertions(+), 110 deletions(-)
--
2.25.1
On 04/12/2023 09:48, Marc Zyngier wrote:
> On Thu, 19 Oct 2023 17:55:01 +0100,
> James Clark <james.clark(a)arm.com> wrote:
>>
>> Add an extra iflag to signify if the TRFCR register is accessible.
>> Because TRBE requires FEAT_TRF, DEBUG_STATE_SAVE_TRBE still has the same
>> behavior even though it's only set when FEAT_TRF is present.
>>
>> The following holes are left in struct kvm_vcpu_arch, but there aren't
>> enough other 8 bit fields to rearrange it to leave any hole smaller than
>> 7 bytes:
>>
>> u8 cflags; /* 2292 1 */
>> /* XXX 1 byte hole, try to pack */
>> u16 iflags; /* 2294 2 */
>> u8 sflags; /* 2296 1 */
>> bool pause; /* 2297 1 */
>> /* XXX 6 bytes hole, try to pack */
>>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> arch/arm64/include/asm/kvm_host.h | 4 +++-
>> arch/arm64/kvm/debug.c | 22 ++++++++++++++++++----
>> 2 files changed, 21 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 7c82927ddaf2..0f0bf8e641bd 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -535,7 +535,7 @@ struct kvm_vcpu_arch {
>> u8 cflags;
>>
>> /* Input flags to the hypervisor code, potentially cleared after use */
>> - u8 iflags;
>> + u16 iflags;
>>
>> /* State flags for kernel bookkeeping, unused by the hypervisor code */
>> u8 sflags;
>> @@ -741,6 +741,8 @@ struct kvm_vcpu_arch {
>> #define DEBUG_STATE_SAVE_TRBE __vcpu_single_flag(iflags, BIT(6))
>> /* vcpu running in HYP context */
>> #define VCPU_HYP_CONTEXT __vcpu_single_flag(iflags, BIT(7))
>> +/* Save trace filter controls */
>> +#define DEBUG_STATE_SAVE_TRFCR __vcpu_single_flag(iflags, BIT(8))
>>
>> /* SVE enabled for host EL0 */
>> #define HOST_SVE_ENABLED __vcpu_single_flag(sflags, BIT(0))
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 8725291cb00a..20cdd40b3c42 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -331,14 +331,28 @@ void kvm_arch_vcpu_load_debug_state_flags(struct kvm_vcpu *vcpu)
>> !(read_sysreg_s(SYS_PMBIDR_EL1) & BIT(PMBIDR_EL1_P_SHIFT)))
>> vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_SPE);
>>
>> - /* Check if we have TRBE implemented and available at the host */
>> - if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
>> - !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
>> - vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
>> + /*
>> + * Save TRFCR on nVHE if FEAT_TRF (TraceFilt) exists. This will be
>> + * done in cases where use of TRBE doesn't completely disable trace and
>> + * handles the exclude_host/exclude_guest rules of the trace session.
>
> This comment provides zero information. What will be done? Under which
> conditions? What are the rules?
>
>> + */
>> + if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceFilt_SHIFT)) {
>> + vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_TRFCR);
>> + /*
>> + * Check if we have TRBE implemented and available at the host. If it's
>> + * in use at the time of guest switch it will need to be disabled and
>> + * then restored. The architecture mandates FEAT_TRF with TRBE, so we
>> + * only need to check for TRBE after TRF.
>> + */
>> + if (cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_EL1_TraceBuffer_SHIFT) &&
>> + !(read_sysreg_s(SYS_TRBIDR_EL1) & TRBIDR_EL1_P))
>> + vcpu_set_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
>> + }
>
> Multiple questions:
>
> - Why is it safe to trust the local CPU's capability rather than the
> consolidated view from the cpufeature infrastructure?
The coresight driver is capable of handling heterogeneous systems. i.e.,
some CPUs may not have FEAT_TRF or FEAT_TRBE. This could be for various
reasons (e.g., CPU Erratum disabling TRBE, though not used now). It
already needs to treat each CPU differently (due to the capabilities
of the ETM).
That said, we could reject events with exclude_guest/exclude_host flags
on CPUs that do not support FEAT_TRF. But that doesn't solve checking
the local capability.
Suzuki
>
> - Why defer the saving of the registers if there are no changes made
> to them in the interval?
>
> Thanks,
>
> M.
>
On 04/12/2023 09:59, Marc Zyngier wrote:
> On Thu, 19 Oct 2023 17:55:02 +0100,
> James Clark <james.clark(a)arm.com> wrote:
>>
>> Add an interface for the Coresight driver to use to set the value of the
>> TRFCR register for the guest. This register controls the exclude
>> settings for trace at different exception levels, and is used to honor
>> the exclude_host and exclude_guest parameters from the Perf session.
>> This will be used to later write TRFCR_EL1 on nVHE at guest switch. For
>> VHE, the host trace is controlled by TRFCR_EL2 and thus we can write to
>> the TRFCR_EL1 immediately. Because guest writes to the register are
>> trapped, the value will persist and can't be modified.
>>
>> The settings must be copied to the vCPU before each run in the same
>> way that PMU events are, because the per-cpu struct isn't accessible in
>> protected mode.
>
> Then maybe we should look at a better way of sharing global data
> between EL1 and EL2 instead of copying stuff ad-nauseam?
>
That probably makes sense, I can have a look into that.
>>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> arch/arm64/include/asm/kvm_host.h | 3 +++
>> arch/arm64/kvm/arm.c | 1 +
>> arch/arm64/kvm/debug.c | 26 ++++++++++++++++++++++++++
>> 3 files changed, 30 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 0f0bf8e641bd..e1852102550d 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -1125,6 +1125,8 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu);
>> void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
>> void kvm_clr_pmu_events(u32 clr);
>> bool kvm_set_pmuserenr(u64 val);
>> +void kvm_etm_set_guest_trfcr(u64 trfcr_guest);
>> +void kvm_etm_update_vcpu_events(struct kvm_vcpu *vcpu);
>> #else
>> static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
>> static inline void kvm_clr_pmu_events(u32 clr) {}
>> @@ -1132,6 +1134,7 @@ static inline bool kvm_set_pmuserenr(u64 val)
>> {
>> return false;
>> }
>> +static inline void kvm_etm_set_guest_trfcr(u64 trfcr_guest) {}
>> #endif
>>
>> void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu);
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 0f717b6a9151..e4d846f2f665 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -1015,6 +1015,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>> kvm_vgic_flush_hwstate(vcpu);
>>
>> kvm_pmu_update_vcpu_events(vcpu);
>> + kvm_etm_update_vcpu_events(vcpu);
>>
>> /*
>> * Ensure we set mode to IN_GUEST_MODE after we disable
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 20cdd40b3c42..2ab41b954512 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -23,6 +23,12 @@
>>
>> static DEFINE_PER_CPU(u64, mdcr_el2);
>>
>> +/*
>> + * Per CPU value for TRFCR that should be applied to any guest vcpu that may
>> + * run on that core in the future.
>> + */
>> +static DEFINE_PER_CPU(u64, guest_trfcr);
>> +
>> /**
>> * save/restore_guest_debug_regs
>> *
>> @@ -356,3 +362,23 @@ void kvm_arch_vcpu_put_debug_state_flags(struct kvm_vcpu *vcpu)
>> vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRBE);
>> vcpu_clear_flag(vcpu, DEBUG_STATE_SAVE_TRFCR);
>> }
>> +
>> +void kvm_etm_set_guest_trfcr(u64 trfcr_guest)
>> +{
>> + if (has_vhe())
>> + write_sysreg_s(trfcr_guest, SYS_TRFCR_EL12);
>> + else
>> + *this_cpu_ptr(&guest_trfcr) = trfcr_guest;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_etm_set_guest_trfcr);
>
> How does the ETM code know what guests it impacts? Don't you have some
> per-process context already?
>
It doesn't know what guests it impacts, it just does it blindly based on
host CPU and whatever guest might run on the CPU in the future. PMU
events are the same.
We do have per-process context for per-process sessions, so if that was
the VM process we might have been able to do something with that info.
But we also have per-cpu sessions that would trace anything that runs on
that CPU, so to be able to support that mode I think it has to be done
without knowing about any guest.
>> +
>> +/*
>> + * Updates the vcpu's view of the etm events for this cpu. Must be
>> + * called before every vcpu run after disabling interrupts, to ensure
>> + * that an interrupt cannot fire and update the structure.
>> + */
>> +void kvm_etm_update_vcpu_events(struct kvm_vcpu *vcpu)
>> +{
>> + if (!has_vhe() && vcpu_get_flag(vcpu, DEBUG_STATE_SAVE_TRFCR))
>> + ctxt_sys_reg(&vcpu->arch.ctxt, TRFCR_EL1) = *this_cpu_ptr(&guest_trfcr);
>> +}
>
> Why this requirement of updating it at all times? Why can't this be
> done in a more lazy way, using the flags to instruct the hypervisor
> what and when to load it?
>
> M.
>
I could probably add a flag that gets set if the guest value should be
different to the host value. I was just trying to keep it simple and in
terms of just what the registers should be.
The PMU one has something similar where it doesn't write anything if
kvm_pmu_switch_needed() is false, but that's only on the path where the
host sets the events, it still always does the copy in
kvm_pmu_update_vcpu_events() before the guest switch.
I suppose if I make the change to have the shared global data then the
copy isn't needed and this function and kvm_pmu_update_vcpu_events()
will just get deleted.
Thanks
James
On 04/12/2023 18:37, kernel test robot wrote:
> Hi James,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on atorgue-stm32/stm32-next]
> [also build test WARNING on linus/master v6.7-rc4 next-20231204]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url: https://github.com/intel-lab-lkp/linux/commits/James-Clark/coresight-Make-c…
> base: https://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32.git stm32-next
> patch link: https://lore.kernel.org/r/20231115162834.355598-1-james.clark%40arm.com
> patch subject: [PATCH v2] coresight: Make current W=1 warnings default
> config: arm-randconfig-r131-20231117 (https://download.01.org/0day-ci/archive/20231205/202312050158.FKpxuafP-lkp@…)
> compiler: arm-linux-gnueabi-gcc (GCC) 13.2.0
> reproduce: (https://download.01.org/0day-ci/archive/20231205/202312050158.FKpxuafP-lkp@…)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp(a)intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202312050158.FKpxuafP-lkp@intel.com/
>
> sparse warnings: (new ones prefixed by >>)
>>> drivers/hwtracing/coresight/coresight-etb10.c:840:17: sparse: sparse: Using plain integer as NULL pointer
> drivers/hwtracing/coresight/coresight-etb10.c: note: in included file (through include/linux/mmzone.h, include/linux/gfp.h, include/linux/xarray.h, ...):
> include/linux/page-flags.h:242:46: sparse: sparse: self-comparison always evaluates to false
> --
>>> drivers/hwtracing/coresight/coresight-funnel.c:395:17: sparse: sparse: Using plain integer as NULL pointer
> drivers/hwtracing/coresight/coresight-funnel.c: note: in included file (through include/linux/mmzone.h, include/linux/gfp.h, include/linux/slab.h, ...):
> include/linux/page-flags.h:242:46: sparse: sparse: self-comparison always evaluates to false
> --
>>> drivers/hwtracing/coresight/coresight-tpdm.c:242:17: sparse: sparse: Using plain integer as NULL pointer
> drivers/hwtracing/coresight/coresight-tpdm.c: note: in included file (through include/linux/mmzone.h, include/linux/gfp.h, include/linux/xarray.h, ...):
> include/linux/page-flags.h:242:46: sparse: sparse: self-comparison always evaluates to false
>
> vim +840 drivers/hwtracing/coresight/coresight-etb10.c
>
> 834
> 835 static const struct amba_id etb_ids[] = {
> 836 {
> 837 .id = 0x000bb907,
> 838 .mask = 0x000fffff,
> 839 },
> > 840 { 0, 0, 0 },
> 841 };
> 842
>
This should already be fixed in V3 here:
https://lore.kernel.org/linux-arm-kernel/20231123120459.287578-1-james.clar…
On 04/12/2023 09:29, Marc Zyngier wrote:
> On Thu, 19 Oct 2023 17:55:00 +0100,
> James Clark <james.clark(a)arm.com> wrote:
>>
>> pmscr_el1 and trfcr_el1 are currently special cased in the
>> host_debug_state struct, but they're just registers after all so give
>> them entries in the sysreg array and refer to them through the host
>> context.
>>
>> Reviewed-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>> ---
>> arch/arm64/include/asm/kvm_host.h | 6 ++--
>> arch/arm64/include/asm/kvm_hyp.h | 4 +--
>> arch/arm64/kvm/hyp/nvhe/debug-sr.c | 44 +++++++++++++++---------------
>> arch/arm64/kvm/hyp/nvhe/switch.c | 4 +--
>> 4 files changed, 28 insertions(+), 30 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 4a966c0d7373..7c82927ddaf2 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -437,6 +437,8 @@ enum vcpu_sysreg {
>> CNTHP_CVAL_EL2,
>> CNTHV_CTL_EL2,
>> CNTHV_CVAL_EL2,
>> + PMSCR_EL1, /* Statistical profiling extension */
>> + TRFCR_EL1, /* Self-hosted trace filters */
>
> Why this move? Are you also adding guest support for SPE?
>
> Until you do, I don't see the need to pollute the guest's sysreg
> namespace.
>
> M.
>
Ah ok yes, I think I misunderstood your previous review comment. You're
right I don't touch SPE and it's only ever for the host so I can leave
it where it was in struct host_debug_state.
James
On 12/1/23 18:05, Sudeep Holla wrote:
> On Fri, Dec 01, 2023 at 11:50:47AM +0530, Anshuman Khandual wrote:
>> Add support for the dynamic replicator device in the platform driver, which
>> can then be used on ACPI based platforms. This change would now allow
>> runtime power management for repliacator devices on ACPI based systems.
>>
>> The driver would try to enable the APB clock if available. Also, rename the
>> code to reflect the fact that it now handles both static and dynamic
>> replicators.
>>
>> Cc: Lorenzo Pieralisi <lpieralisi(a)kernel.org>
>> Cc: Sudeep Holla <sudeep.holla(a)arm.com>
>
> Except the minor nit below which may apply also for few other patches
> in the series
>
> Acked-by: Sudeep Holla <sudeep.holla(a)arm.com> # For ACPI related changes
> Tested-by: Sudeep Holla <sudeep.holla(a)arm.com> # Boot and driver probe only
>
> [...]
>
>> diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
>> index b6be73034996..64de0bee02ec 100644
>> --- a/drivers/hwtracing/coresight/coresight-replicator.c
>> +++ b/drivers/hwtracing/coresight/coresight-replicator.c
>> @@ -38,6 +38,7 @@ DEFINE_CORESIGHT_DEVLIST(replicator_devs, "replicator");
>> struct replicator_drvdata {
>> void __iomem *base;
>> struct clk *atclk;
>> + struct clk *pclk;
>
> [minor nit] Perhaps can be documented as well ?
Sure, will add the following comment above the structure.
@pclk: optional clock for the core parts of the replicator.
Hi All,
On a Linux system with isolated cpus and nohz_full[1] we see the following issue:
While profiling an application running on isolated CPU with Coresight via Perf
tool, we noticed that only one sample of trace data is collected. To collect
more samples, we tried using the -F max option.
$ perf record -e cs_etm/@tmc_etr0/ -F max -C 5 taskset -c 5 test-app
This results in a lot of trace data samples being generated. As a side-effect
several timer work interrupts are handled on that core which are affecting
latency of the application running on the isolated cpu.
Setting -F to a lower value isn't changing the behavior and still results
in the same amount of trace samples as -F max option.
Have you noticed this behavior with -F option ?
To avoid too many timer interrupts being handled on the profiled core, are
the following solutions viable ?
1. Send IPIs to enable/disable the ETMs and handle the timer interrupts on a
different Core.
2. CTI to enable/disable the ETMs on every timer expiry.
Can you please suggest any other solutions ?
With Regards,
Tanmay
[1] Kernel command line: isolcpus=4-23 nohz_full=4-23 rcu_nocbs=4-23