Current code syncs the buffer range is [offset, offset+len), it doesn't consider the case when the trace data is wrapped around, in this case 'offset+len' is bigger than 'etr_buf->size'. Thus it syncs buffer out of the memory buffer, and it also misses to sync buffer from the start of the memory.
This patch corrects the memory sync ranges, when detects the wrapping around case, it splits into two chunks: one chunk is the tail of the buffer and another chunk is from the start of the buffer after wrapping around.
Signed-off-by: Leo Yan leo.yan@linaro.org --- .../hwtracing/coresight/coresight-tmc-etr.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 888b0f929d33..a1afefcbf175 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -780,7 +780,23 @@ static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) else etr_buf->len = ((w_offset < r_offset) ? etr_buf->size : 0) + w_offset - r_offset; - tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len); + + if (r_offset + etr_buf->len > etr_buf->size) { + int len1, len2; + + /* + * If trace data is wrapped around, sync AUX bounce buffer + * for two chunks: "len1" is for the trace date length at + * the tail of bounce buffer, and "len2" is the length from + * the start of the buffer after wrapping around. + */ + len1 = etr_buf->size - r_offset; + len2 = etr_buf->len - len1; + tmc_sg_table_sync_data_range(table, r_offset, len1); + tmc_sg_table_sync_data_range(table, 0, len2); + } else { + tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len); + } }
static const struct etr_buf_operations etr_sg_buf_ops = {
Current code misses to handle the case for the trace wrapping around, thus it fails to sync the complete memory ranges in catu mode.
This patch corrects the memory sync ranges, when detects the wrapping around case, it splits into two chunks: one chunk is the tail of the buffer and another chunk is from the start of the buffer after wrapping around.
Signed-off-by: Leo Yan leo.yan@linaro.org --- drivers/hwtracing/coresight/coresight-catu.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-catu.c b/drivers/hwtracing/coresight/coresight-catu.c index e0740c6dbd54..634af451f0d3 100644 --- a/drivers/hwtracing/coresight/coresight-catu.c +++ b/drivers/hwtracing/coresight/coresight-catu.c @@ -323,7 +323,24 @@ static void catu_sync_etr_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) }
etr_buf->offset = r_offset; - tmc_sg_table_sync_data_range(catu_table, r_offset, etr_buf->len); + + if (r_offset + etr_buf->len > etr_buf->size) { + int len1, len2; + + /* + * If trace data is wrapped around, sync AUX bounce buffer + * for two chunks: "len1" is for the trace date length at + * the tail of bounce buffer, and "len2" is the length from + * the start of the buffer after wrapping around. + */ + len1 = etr_buf->size - r_offset; + len2 = etr_buf->len - len1; + tmc_sg_table_sync_data_range(catu_table, r_offset, len1); + tmc_sg_table_sync_data_range(catu_table, 0, len2); + } else { + tmc_sg_table_sync_data_range(catu_table, r_offset, + etr_buf->len); + } }
static int catu_alloc_etr_buf(struct tmc_drvdata *tmc_drvdata,
Hi Leo,
On 10/07/2021 08:02, Leo Yan wrote:
Current code misses to handle the case for the trace wrapping around, thus it fails to sync the complete memory ranges in catu mode.
Similar to the tmc-sg, the infrastructure take care of wrapping the page indices to sync the correct memory.
Suzuki
This patch corrects the memory sync ranges, when detects the wrapping around case, it splits into two chunks: one chunk is the tail of the buffer and another chunk is from the start of the buffer after wrapping around.
Signed-off-by: Leo Yan leo.yan@linaro.org
drivers/hwtracing/coresight/coresight-catu.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-catu.c b/drivers/hwtracing/coresight/coresight-catu.c index e0740c6dbd54..634af451f0d3 100644 --- a/drivers/hwtracing/coresight/coresight-catu.c +++ b/drivers/hwtracing/coresight/coresight-catu.c @@ -323,7 +323,24 @@ static void catu_sync_etr_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) } etr_buf->offset = r_offset;
- tmc_sg_table_sync_data_range(catu_table, r_offset, etr_buf->len);
- if (r_offset + etr_buf->len > etr_buf->size) {
int len1, len2;
/*
* If trace data is wrapped around, sync AUX bounce buffer
* for two chunks: "len1" is for the trace date length at
* the tail of bounce buffer, and "len2" is the length from
* the start of the buffer after wrapping around.
*/
len1 = etr_buf->size - r_offset;
len2 = etr_buf->len - len1;
tmc_sg_table_sync_data_range(catu_table, r_offset, len1);
tmc_sg_table_sync_data_range(catu_table, 0, len2);
- } else {
tmc_sg_table_sync_data_range(catu_table, r_offset,
etr_buf->len);
- } }
static int catu_alloc_etr_buf(struct tmc_drvdata *tmc_drvdata,
Hi Leo,
On 10/07/2021 08:02, Leo Yan wrote:
Current code syncs the buffer range is [offset, offset+len), it doesn't consider the case when the trace data is wrapped around, in this case 'offset+len' is bigger than 'etr_buf->size'. Thus it syncs buffer out of the memory buffer, and it also misses to sync buffer from the start of the memory.
I doubt this claim is valid. We do the sync properly, taking the page corresponding to the "offset" wrapping it around in "page" index.
Here is the code :
void tmc_sg_table_sync_data_range(struct tmc_sg_table *table, u64 offset, u64 size) { int i, index, start; int npages = DIV_ROUND_UP(size, PAGE_SIZE); struct device *real_dev = table->dev->parent; struct tmc_pages *data = &table->data_pages;
start = offset >> PAGE_SHIFT; for (i = start; i < (start + npages); i++) { index = i % data->nr_pages; dma_sync_single_for_cpu(real_dev, data->daddrs[index], PAGE_SIZE, DMA_FROM_DEVICE); } }
See that the npages accounts for the "size" requested and we wrap the "index" by the total number of pages in the buffer and pick the right page.
So, I think this fix is not needed.
Cheers Suzuki
This patch corrects the memory sync ranges, when detects the wrapping around case, it splits into two chunks: one chunk is the tail of the buffer and another chunk is from the start of the buffer after wrapping around.
Signed-off-by: Leo Yan leo.yan@linaro.org
.../hwtracing/coresight/coresight-tmc-etr.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 888b0f929d33..a1afefcbf175 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -780,7 +780,23 @@ static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) else etr_buf->len = ((w_offset < r_offset) ? etr_buf->size : 0) + w_offset - r_offset;
- tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
- if (r_offset + etr_buf->len > etr_buf->size) {
int len1, len2;
/*
* If trace data is wrapped around, sync AUX bounce buffer
* for two chunks: "len1" is for the trace date length at
* the tail of bounce buffer, and "len2" is the length from
* the start of the buffer after wrapping around.
*/
len1 = etr_buf->size - r_offset;
len2 = etr_buf->len - len1;
tmc_sg_table_sync_data_range(table, r_offset, len1);
tmc_sg_table_sync_data_range(table, 0, len2);
- } else {
tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
- } }
static const struct etr_buf_operations etr_sg_buf_ops = {
On Mon, Jul 12, 2021 at 11:25:54AM +0100, Suzuki Kuruppassery Poulose wrote:
Hi Leo,
On 10/07/2021 08:02, Leo Yan wrote:
Current code syncs the buffer range is [offset, offset+len), it doesn't consider the case when the trace data is wrapped around, in this case 'offset+len' is bigger than 'etr_buf->size'. Thus it syncs buffer out of the memory buffer, and it also misses to sync buffer from the start of the memory.
I doubt this claim is valid. We do the sync properly, taking the page corresponding to the "offset" wrapping it around in "page" index.
Here is the code :
void tmc_sg_table_sync_data_range(struct tmc_sg_table *table, u64 offset, u64 size) { int i, index, start; int npages = DIV_ROUND_UP(size, PAGE_SIZE); struct device *real_dev = table->dev->parent; struct tmc_pages *data = &table->data_pages;
start = offset >> PAGE_SHIFT; for (i = start; i < (start + npages); i++) { index = i % data->nr_pages; dma_sync_single_for_cpu(real_dev, data->daddrs[index], PAGE_SIZE, DMA_FROM_DEVICE); }
}
See that the npages accounts for the "size" requested and we wrap the "index" by the total number of pages in the buffer and pick the right page.
So, I think this fix is not needed.
Ouch, you are right :) Let's drop these two patches.
Thanks, Leo