On Thu, 7 May 2020 at 14:03, Mike Leach <mike.leach@linaro.org> wrote:
Hi,

On Thu, 7 May 2020 at 18:45, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
>
> On Wed, 6 May 2020 at 16:55, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >
> > On 05/06/2020 08:43 PM, Mathieu Poirier wrote:
> > > On Wed, 6 May 2020 at 11:00, Mike Leach <mike.leach@linaro.org> wrote:
> > >>
> > >> Hi Mathieu,
> > >>
> > >> On Wed, 6 May 2020 at 17:49, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
> > >>>
> > >>> On Wed, 6 May 2020 at 04:56, Mike Leach <mike.leach@linaro.org> wrote:
> > >>>>
> > >>>> Hi Mathieu,
> > >>>>
> > >>>> On Mon, 4 May 2020 at 21:31, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
> > >>>>>
> > >>>>> On Fri, May 01, 2020 at 03:02:22PM +0100, Mike Leach wrote:
> > >>>>>> Adds a method to select a suitable sink connected to a given source.
> > >>>>>>
> > >>>>>> In cases where no sink is defined, the coresight_find_default_sink
> > >>>>>> routine can search from a given source, through the child connections
> > >>>>>> until a suitable sink is found.
> > >>>>>>
> > >>>>>> The suitability is defined in a sink_select_priority parameter on the
> > >>>>>> CoreSight device.
> > >>>>>>
> > >>>>>> This allows for default sink to be discovered were none is specified
> > >>>>>> (e.g. perf command line)
> > >>>>>>
> > >>>>>> Signed-off-by: Mike Leach <mike.leach@linaro.org>
> > >>>>>> ---
> > >>>>>>   drivers/hwtracing/coresight/coresight-priv.h |  2 +
> > >>>>>>   drivers/hwtracing/coresight/coresight.c      | 76 ++++++++++++++++++++
> > >>>>>>   include/linux/coresight.h                    | 10 ++-
> > >>>>>>   3 files changed, 87 insertions(+), 1 deletion(-)
> > >>>>>>
> > >>>>>> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> > >>>>>> index 890f9a5c97c6..856d8a29658c 100644
> > >>>>>> --- a/drivers/hwtracing/coresight/coresight-priv.h
> > >>>>>> +++ b/drivers/hwtracing/coresight/coresight-priv.h
> > >>>>>> @@ -150,6 +150,8 @@ int coresight_enable_path(struct list_head *path, u32 mode, void *sink_data);
> > >>>>>>   struct coresight_device *coresight_get_sink(struct list_head *path);
> > >>>>>>   struct coresight_device *coresight_get_enabled_sink(bool reset);
> > >>>>>>   struct coresight_device *coresight_get_sink_by_id(u32 id);
> > >>>>>> +struct coresight_device *
> > >>>>>> +coresight_find_default_sink(struct coresight_device *csdev);
> > >>>>>>   struct list_head *coresight_build_path(struct coresight_device *csdev,
> > >>>>>>                                       struct coresight_device *sink);
> > >>>>>>   void coresight_release_path(struct list_head *path);
> > >>>>>> diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
> > >>>>>> index c71553c09f8e..0564d26f5eb3 100644
> > >>>>>> --- a/drivers/hwtracing/coresight/coresight.c
> > >>>>>> +++ b/drivers/hwtracing/coresight/coresight.c
> > >>>>>> @@ -769,6 +769,82 @@ void coresight_release_path(struct list_head *path)
> > >>>>>>        path = NULL;
> > >>>>>>   }
> > >>>>>>
> > >>>>>> +/**
> > >>>>>> + * coresight_find_sink - recursive function to walk trace connections from
> > >>>>>> + * source to find a suitable default sink.
> > >>>>>> + *
> > >>>>>> + * @csdev: source / current device to check.
> > >>>>>> + * @first_sink: the first lower priority sink found.
> > >>>>>> + *
> > >>>>>> + * Sinks are given priority weighting, the first HI priority sink will be
> > >>>>>> + * used, if not found then the next highest priority sink will be recorded
> > >>>>>> + * for possible later use.
> > >>>>>> + * (Higher numbers indicate lower priorities, 0 not a sink or not to be
> > >>>>>> + * selected as default)
> > >>>>>
> > >>>>> I think we should go for higher number == higher priority because 1) it is more
> > >>>>> intuitive and 2) it allows us to grow with minimal changes.  Right now ETB10,
> > >>>>> TMC-ETF and TPIU are '0' and ETR is '1'.  If new IP comes out we can assign them
> > >>>>> higher numbers with little changes.
> > >>>>>
> > >>>>
> > >>>> Went for this as it matches priority scheme for linux kernel where
> > >>>> 'nice' numbers have higher value for lower priority (only difference
> > >>>> being we use 0 as a "don't use" value and never use -ve values) & I
> > >>>> was thinking 1st priority is more important than 2nd priority etc.
> > >>>
> > >>> The "nice" priorities is one of the least intuitive heuristic to ever
> > >>> come out of the kernel and I would like to avoid any design that
> > >>> follows the same pattern.
> > >>>
> > >>>>
> > >>>>>> + *
> > >>>>>> + * This will walk the connection path from a source (etm) till a suitable
> > >>>>>> + * sink is encountered and return that sink to the original caller.
> > >>>>>> + *
> > >>>>>> + * return HI priority sink, or NULL if not found.
> > >>>>>> + * Best lower priority sink returned in first_sink.
> > >>>>>> + */
> > >>>>>> +static struct coresight_device *
> > >>>>>> +coresight_find_sink(struct coresight_device *csdev,
> > >>>>>> +                 struct coresight_device **first_sink)
> > >>>>>> +{
> > >>>>>> +     int i;
> > >>>>>> +
> > >>>>>> +     /* return if highest priority sink */
> > >>>>>> +     if (csdev->sink_select_priority == CORESIGHT_SINK_SELECT_PRIORITY_HI)
> > >>>>>> +             return csdev;
> > >>>>>> +
> > >>>>>> +     /* record if lower priority sink */
> > >>>>>> +     if (csdev->sink_select_priority > CORESIGHT_SINK_SELECT_PRIORITY_HI) {
> > >>>>>> +             if ((*first_sink == NULL) ||
> > >>>>>> +                 ((*first_sink)->sink_select_priority >
> > >>>>>> +                  csdev->sink_select_priority)) {
> > >>>>>> +                     *first_sink = csdev;
> > >>>>>> +             }
> > >>>>>> +     }
> > >>>>>> +
> > >>>>>> +     /*
> > >>>>>> +      * Not a sink we want - recursively explore each port found on
> > >>>>>> +      * this element.
> > >>>>>> +      */
> > >>>>>> +     for (i = 0; i < csdev->pdata->nr_outport; i++) {
> > >>>>>> +             struct coresight_device *child_dev, *sink;
> > >>>>>> +
> > >>>>>> +             child_dev = csdev->pdata->conns[i].child_dev;
> > >>>>>> +             if (child_dev) {
> > >>>>>> +                     sink = coresight_find_sink(child_dev, first_sink);
> > >>>>>> +                     if (sink)
> > >>>>>> +                             return sink;
> > >>>>>
> > >>>>> If @csdev is a replicator and a sink is found on port '0' components connected,
> > >>>>> directly or not, to port '1' won't be searched.
> > >>>>
> > >>>> Correct - by design - return the first found highest priority sink.
> > >>>>
> > >>>>> Also, instead of contending
> > >>>>> with a 'sink' and a 'first_sink' I would simply make coresight_find_sink()
> > >>>>> return the highest priority sink found from any @csdev port.  That should
> > >>>>> simplify thing a little.
> > >>>>
> > >>>> sink is used to early terminate the search if we find a sink that has
> > >>>> the highest possible priority.
> > >>>
> > >>> For now this will work but it won't when 1:1 topologies start coming
> > >>> around.  I would like to find a solution where we can deal with N:1,
> > >>> 1:1 and 1:1 + global sink topologies.
> > >>>
> > >>
> > >> I don't understand why you think this will not work with other
> > >> topologies. It will work perfectly well with 1:1 as it walks from
> > >> source to sink, so will hit the correct sink for the source.
> > >
> > > 1:1 will work but 1:1 + global sink won't if the topology is organised
> > > in such a way that the global (and common) sink is discovered first.
> > > I would like to believe a common sink won't be found in 1:1 topologies
> > > but that isn't guaranteed.
> > >
> > >>
> > >>>>
> > >>>>>
> > >>>>>> +             }
> > >>>>>> +     }
> > >>>>>> +     return NULL;
> > >>>>>> +}
> > >>>>>> +
> > >>>>>> +/**
> > >>>>>> + * coresight_find_default_sink: Find a sink suitable for use as a
> > >>>>>> + * default sink.
> > >>>>>> + *
> > >>>>>> + * @csdev: starting source to find a connected sink.
> > >>>>>> + *
> > >>>>>> + * Walks connections graph looking for a suitable sink to enable for the
> > >>>>>> + * supplied source. Uses sink_select_priority values.
> > >>>>>> + *
> > >>>>>> + * Used in cases where the CoreSight user (perf / sysfs) has not selected a
> > >>>>>> + * sink.
> > >>>>>> + */
> > >>>>>> +struct coresight_device *
> > >>>>>> +coresight_find_default_sink(struct coresight_device *csdev)
> > >>>>>> +{
> > >>>>>> +     struct coresight_device *first_sink = NULL, *sink;
> > >>>>>> +
> > >>>>>> +     sink = coresight_find_sink(csdev, &first_sink);
> > >>>>>> +     return sink ? sink : first_sink;
> > >>>>>> +}
> > >>>>>> +
> > >>>>>>   /** coresight_validate_source - make sure a source has the right credentials
> > >>>>>>    *  @csdev:  the device structure for a source.
> > >>>>>>    *  @function:       the function this was called from.
> > >>>>>> diff --git a/include/linux/coresight.h b/include/linux/coresight.h
> > >>>>>> index 193cc9dbf448..ff443761c8fc 100644
> > >>>>>> --- a/include/linux/coresight.h
> > >>>>>> +++ b/include/linux/coresight.h
> > >>>>>> @@ -161,10 +161,13 @@ struct coresight_connection {
> > >>>>>>    * @enable:  'true' if component is currently part of an active path.
> > >>>>>>    * @activated:       'true' only if a _sink_ has been activated.  A sink can be
> > >>>>>>    *           activated but not yet enabled.  Enabling for a _sink_
> > >>>>>> - *           appens when a source has been selected for that it.
> > >>>>>> + *           happens when a source has been selected for that it.
> > >>>>>
> > >>>>> This is obviously wrong, so much I can't even figure out what I meant.  Please
> > >>>>> remove the entire sentence in a cleanup patch that goes at the beginning of this
> > >>>>> set.
> > >>>>>
> > >>>>>>    * @ea:              Device attribute for sink representation under PMU directory.
> > >>>>>>    * @ect_dev: Associated cross trigger device. Not part of the trace data
> > >>>>>>    *           path or connections.
> > >>>>>> + * @sink_select_priority: Used when the system is required to choose a default
> > >>>>>> + *           sink. Higher numbers indicate lower priority. Unused (0) if not
> > >>>>>> + *           a sink / or sink not supported as a default.
> > >>>>>>    */
> > >>>>>>   struct coresight_device {
> > >>>>>>        struct coresight_platform_data *pdata;
> > >>>>>> @@ -180,8 +183,13 @@ struct coresight_device {
> > >>>>>>        struct dev_ext_attribute *ea;
> > >>>>>>        /* cross trigger handling */
> > >>>>>>        struct coresight_device *ect_dev;
> > >>>>>> +     /* mark priority when system choosing a default sink  */
> > >>>>>> +     int sink_select_priority;
> > >>>>>
> > >>>>> When I first looked at this set I didn't think there was a need for something
> > >>>>> like this but after reflection we do since information about the kind of sink
> > >>>>> (ETB10, TMC-ETF/ETR, TPIU) isn't available from the coresight_device.  You can
> > >>>>> also forget my comment about things not working for ETB10 and TPIU, it is
> > >>>>> clearer now.
> > >>>>>
> > >>>>> I would simply call this "priority" and move it in the sink section, i.e just
> > >>>>> under @activated.  For extra credit please move @ea after @ops (also in a prequel
> > >>>>> patch) since it is used for all coresight_device.
> > >>>>>
> > >>>>
> > >>>> could do this - or rename per suzukis suggestion.
> > >>>>
> > >>>>>>   };
> > >>>>>>
> > >>>>>> +/* The highest priority sink - will terminate default sink search */
> > >>>>>> +#define CORESIGHT_SINK_SELECT_PRIORITY_HI    1
> > >>>>>
> > >>>>> Why not calling this CORESIGHT_PRIORITY_ETR, so that when the next best thing
> > >>>>> comes out we can have CORESIGHT_PRIORITY_NEXT_BEST_THING.  Hopefully we won't
> > >>>>> need to do that because everyone will move to 1:1 topologies but I know better.
> > >>>>>
> > >>>> I wanted to completely decouple the "Type" of sink from the selection
> > >>>> priority. It is not important that it is an ETR now or maybe something
> > >>>> else later.
> > >>>> If something new comes along that is better then it is up to the sink
> > >>>> drivers to allocate priorities accordingly. At present it is simple as
> > >>>> a single driver covers the sinks of interest, but in future we may
> > >>>> need to be more dynamic about allocating sinks.
> > >>>>
> > >>>
> > >>> When newETR comes out CORESIGHT_PRIORITY_NEWETR can be defined with a
> > >>> priority of '2'.  With higher number == higher priority we wouldn't
> > >>> have to change any other driver, just assign the new priority in the
> > >>> probe() code of newETR.
> > >>>
> > >>> Just like the search algorithm above, we may have to think further
> > >>> about what we want to do.
> > >>>
> > >>
> > >> The search algorithm was predicated on a consensus I thought was
> > >> reached in the previous discussions on this matter - do a simple find
> > >> first ETR.
> > >
> > > You are correct, we did have a consensus but what we agreed on clearly
> > > has some drawbacks.  As with everyone else I am unhappy to revisit the
> > > topic but I'd rather do that now than having to live with something
> > > that doesn't scale.
> > >
> > >>
> > >> I am happy to change this again if this is no longer agreed as the way forwards.
> > >
> > > I think that in 1:1 and 1:1 + global sink topologies the sink closest
> > > to the ETM should be selected, regardless of priorities.  Things get
> > > trickier with N:1 topologies and, given all the possible scenarios, I
> > > am wondering if using priorities at all is the way to go.
> >
> > We could have per CPU sinks and global sinks.
>
> Correct, and that is a topology that creates a lot of problems.
>
> > So if we find the closest
> > "sink" from the ETM, we should be fine. Also we need not impose priority
> > among the "sinks" that can route the trace to system memory. i.e, we
> > could think of the sink "subtypes" as the implicit priority.
> >
> > Given we want to use something that can stream the trace to memory on a
> > sink (ETR, Future IPs) and not something with an internal memory
> > (ETB/ETF/ETB10), could we simply use the coresight_dev_subtype_sink
> > instead of defining priorities ?
> > i.e: Add a new subtypefor SINKs that can push the trace to system memory.
> >
> > enum coresight_dev_subtype_sink {
> >          CORESIGHT_DEV_SUBTYPE_SINK_NONE,
> >          CORESIGHT_DEV_SUBTYPE_SINK_PORT,
> >          CORESIGHT_DEV_SUBTYPE_SINK_BUFFER,
> > +       CORESIGHT_DEV_SUBTYPE_SINK_SYS_RAM,
> > };

I had avoided adding a new sub-type as it implied a lot of rework of
code beyond just the current problem.
However, a more careful look has shown that
CORESIGHT_DEV_SUBTYPE_SINK* are never actually used anywhere at
present so this may be a realistic option.,

>
> I like that part as it saves us from adding a new field to struct
> coresight_device.
>
> >
> > Then the priority is implicit by higher subtype_sink value. With this
> > and the closest sink (no of components to the sink) can be easily done.
> >
> > >
> > > I can't remember if we have considered the device tree before, but
> > > that would be really easy to do.  We could introduce a new "default"
> > > property, only valid on sink devices, that would indicate what sink to

The first patchset was based on exactly this. It was felt this was not
a good approach as ACPI support would be needed too.


Sorry for the noise... I review so much code I can't keep track of it all.
 
> > > use if one isn't explicitly given on the command line.  If a sink
> > > isn't specified on the command line and a default sink wasn't
> > > specified in the DT, then we take the sink closest to the ETM.
> >
> > Personally I do not prefer DT approach.
>
> Same here.
>
> >
> > Something like the below should work for all the cases ?
> >
> > struct coresight_device *
> > coresight_find_sink(struct coresight_device *dev, int *pdepth)
> > {
> >         int i, depth;
> >         struct coresight_device *res = NULL;
> >
> >         /* We have reached sink at the given depth, end search */
> >         if (dev->type == SINK)
> >                 return dev;
> >
> >         for (i = 0; i < dev->nr_outports; i ++) {
> >                 struct coresight_device *tmp;
> >                 int t = 0;
> >
> >                 if (!dev->conns[i].child_dev)
> >                         continue;
> >                 tmp = coresight_find_sink(dev->conns[i].child_dev, &t);
> >                 if (!tmp)
> >                         continue;
> >                 /* First sink found, set initialize the result */
> >                 if (!res) {
> >                         res = tmp;
> >                         depth = t;
> >                         continue;
> >                 }
> >
> >                 /*
> >                  * Update the result if :
> >                  * a) if we have found a better sink type
> >                  * b) Or something closer than the current one with
> >                  *    same capability
> >                  */
> >                 if ((res->subtype.sink_subtype < tmp->subtype.sink_subtype) ||
> >                     (t < depth)) {
> >                         depth = t;
> >                         res = tmp;
> >                 }
>
> In a 1:1 + global sink topology where the CPU sinks are ETB and the
> global sink is an ETR, if the global sink is discovered after a CPU
> sink, the global sink will be selected.  In a N:1 topology, if a TPIU
> is located closest to the ETMs and discovered after an ETR, the TPIU
> will be selected.I think

The priority schema used in the current patch set deliberately
discounts TPIU as a selectable sink. This must be a feature of any
automated scheme as selecting a TPIU (or any off-chip sink) fior
purely self hosted trace can never be valid.

If we consider the CS-BSA document, using ETB as a 1:1 sink would not
be recommended - it requires that 1:1 topologies use trace capture to
system memory. An intervening ETF between ETM and ETR is permitted.
Here we can discount a 1:1 schema witn ETM:ETB, and reasonably assume
either ETM:<ETR or similar> or <ETM>:<ETF>:<ETR or similar>

Using an ETB wasn't a good example.  The same example, i.e 1:1 + global sink where the CPU sinks are ETRs and the global sink is ETR-2.0.  But if we prioritise based on hops we should be fine. 

[1]. https://static.docs.arm.com/den0068/10/ARM-DEN-0068_coresight_base_system_architecture_1.0.pdf
 

> The problem is that our search criteria are changing based on the mode
> we're in.  1:1 we want the shortest distance from source while N:1 a
> type of sink is preferred.  Finding the kind of topology we have on a
> system will allow us to select the right search heuristic.  One way to
> do that is to keep track of the amount of sinks registered on a system
> as they get registered.  If nr_sinks >= NR_CPUs, we have a 1:1 or a
> 1:1 + global sink.  Otherwise we're back to the good old N:1.
>
not necessarily true - a 4 core 2 cluster config, with ETF per cluster
+ ETR + TPIU actually has 4 sinks and 4 cores, but is N:1. (juno-r1/2
gets close to this with a 6:4 CPU:sink ratio).


I don't think we do need to change the search criteria - just get the
prioritisation correct. Use priority + distance to pick, prioritize
system memory devices (ETR etc) over local buffer devices (ETF/ETB).
This should be sufficient for current and envisaged (by CS-BSA) future systems.

Things get much easier if we narrow the field of supported topologies to what is in the CS-BSA.  In that case picking the closest sink should come first, using priorities to break any ties.
 

Regards

Mike

> What do you guys think?
>
> >         }
> >
> >         /* Adjust the depth to accommodate the current dev */
> >         if (res && pdepth)
> >                 *pdepth = depth + 1;
> >
> >         return res;
> > }
> >
> >
> >
> > }
> >
> > struct coresight_device * coresight_find_default_sink(source)
> > {
> >         return coresight_find_sink(source, NULL);
> > }
> >
> > Thoughts ?
> >
> > Suzuki
> >
> > >
> > > Let me know what you think...
> > >
> > > Thanks,
> > > Mathieu
> > >
> > >>
> > >> Regards
> > >>
> > >> Mike
> > >>
> > >>>>
> > >>>>
> > >>>>> For you next revision, please rebase on coresight-next, I couldn't apply this
> > >>>>> one properly.
> > >>>>>
> > >>>>
> > >>>> Sorry - was using Linux-5.7-rc3 main branch. Hadn;t realised there had
> > >>>> been sufficient diversion that this would not also work on your
> > >>>> coresight branch.
> > >>>>
> > >>>> Mike
> > >>>>
> > >>>>
> > >>>>> Thanks,
> > >>>>> Mathieu
> > >>>>>
> > >>>>>
> > >>>>>> +
> > >>>>>>   /*
> > >>>>>>    * coresight_dev_list - Mapping for devices to "name" index for device
> > >>>>>>    * names.
> > >>>>>> --
> > >>>>>> 2.17.1
> > >>>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Mike Leach
> > >>>> Principal Engineer, ARM Ltd.
> > >>>> Manchester Design Centre. UK
> > >>
> > >>
> > >>
> > >> --
> > >> Mike Leach
> > >> Principal Engineer, ARM Ltd.
> > >> Manchester Design Centre. UK
> >



--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK