This patch series is to refine the memory barriers for AUX ring buffer.
Patches 01 ~ 04 to address the barriers usage in the kernel. The first patch is to make clear comment for how to use the barriers between the data store and aux_head store, this asks the driver to make sure the data is visible. Patches 02 ~ 04 is to refine the drivers for barriers after the data store.
Patch 05 is to use WRITE_ONCE() for updating aux_tail.
Patches 06 ~ 09 is to drop the legacy __sync functions, and polish for duplicate code and cleanup the build and feature test after SYNC_COMPARE_AND_SWAP is not used.
For easier review and more clear patch organization, comparing against to the previous patch series, the patches for support compat mode for AUX trace have been left out and will be sent out as a separate patch set.
This patch set have been tested on Arm64 Juno platform.
Changes from v4: - Refined comment for CoreSight ETR/ETF drivers (Suzuki/Peter); - Changed to use compiler barrier for BTS (mentioned by Peter, but have not received response from Intel developer); - Refined the coding style for patch 07 (Adrian).
Changes from v3: - Removed the inapprocate paragraph in the commit log for patch "perf auxtrace: Drop legacy __sync functions" (Adrian); - Added new patch to remove feature-sync-compare-and-swap test (Adrian); - Th patch for "perf auxtrace: Use WRITE_ONCE() for updating aux_tail", is a standlone and simple change, so moved it ahead in the patch set for better ordering; - Minor improvement for commit logs in the last two patches.
Changes from v2: - Removed auxtrace_mmap__read_snapshot_head(), which has the duplicated code with auxtrace_mmap__read_head(); - Cleanuped the build for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT (Adrian); - Added global variable "kernel_is_64_bit" (Adrian); - Added compat variants compat_auxtrace_mmap__{read_head|write_tail} (Adrian).
Leo Yan (9): perf/ring_buffer: Add comment for barriers on AUX ring buffer coresight: tmc-etr: Add barrier after updating AUX ring buffer coresight: tmc-etf: Add comment for store ordering perf/x86: Add compiler barrier after updating BTS perf auxtrace: Use WRITE_ONCE() for updating aux_tail perf auxtrace: Drop legacy __sync functions perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT tools: Remove feature-sync-compare-and-swap feature detection
arch/x86/events/intel/bts.c | 6 ++++ .../hwtracing/coresight/coresight-tmc-etf.c | 5 +++ .../hwtracing/coresight/coresight-tmc-etr.c | 8 +++++ kernel/events/ring_buffer.c | 9 ++++++ tools/build/Makefile.feature | 1 - tools/build/feature/Makefile | 4 --- tools/build/feature/test-all.c | 4 --- .../feature/test-sync-compare-and-swap.c | 15 --------- tools/perf/Makefile.config | 4 --- tools/perf/util/auxtrace.c | 18 +++-------- tools/perf/util/auxtrace.h | 31 +------------------ 11 files changed, 34 insertions(+), 71 deletions(-) delete mode 100644 tools/build/feature/test-sync-compare-and-swap.c
AUX ring buffer applies almost the same barriers as perf ring buffer, but there has an exception for ordering between writing the AUX trace data and updating user_page::aux_head.
This patch adds comment for how to use the barriers on AUX ring buffer, and gives comment to ask the drivers to flush the trace data into AUX ring buffer prior to updating user_page::aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org --- kernel/events/ring_buffer.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 52868716ec35..5cf6579be05e 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size) perf_event_aux_event(handle->event, aux_head, size, handle->aux_flags);
+ /* + * See perf_output_put_handle(), AUX ring buffer applies the same + * barrier pairing as the perf ring buffer; except for B, since + * AUX ring buffer is written by hardware trace, we cannot simply + * use the generic memory barrier (like smp_wmb()) prior to update + * user_page::aux_head, the hardware trace driver takes the + * responsibility to ensure the trace data has been flushed into + * the AUX buffer before calling perf_aux_output_end(). + */ WRITE_ONCE(rb->user_page->aux_head, rb->aux_head); if (rb_need_aux_wakeup(rb)) wakeup = true;
Hi Peter,
On Mon, Aug 09, 2021 at 07:13:59PM +0800, Leo Yan wrote:
AUX ring buffer applies almost the same barriers as perf ring buffer, but there has an exception for ordering between writing the AUX trace data and updating user_page::aux_head.
This patch adds comment for how to use the barriers on AUX ring buffer, and gives comment to ask the drivers to flush the trace data into AUX ring buffer prior to updating user_page::aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
You have given the ACK tag before, could you pick up this patch?
Thanks, Leo
kernel/events/ring_buffer.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 52868716ec35..5cf6579be05e 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size) perf_event_aux_event(handle->event, aux_head, size, handle->aux_flags);
- /*
* See perf_output_put_handle(), AUX ring buffer applies the same
* barrier pairing as the perf ring buffer; except for B, since
* AUX ring buffer is written by hardware trace, we cannot simply
* use the generic memory barrier (like smp_wmb()) prior to update
* user_page::aux_head, the hardware trace driver takes the
* responsibility to ensure the trace data has been flushed into
* the AUX buffer before calling perf_aux_output_end().
WRITE_ONCE(rb->user_page->aux_head, rb->aux_head); if (rb_need_aux_wakeup(rb)) wakeup = true;*/
-- 2.25.1
Since a memory barrier is required between AUX trace data store and aux_head store, and the AUX trace data is filled with memcpy(), it's sufficient to use smp_wmb() so can ensure the trace data is visible prior to updating aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index acdb59e0e661..13fd1fc730ed 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev, */ if (etr_perf->snapshot) handle->head += size; + + /* + * Ensure that the AUX trace data is visible before the aux_head + * is updated via perf_aux_output_end(), as expected by the + * perf ring buffer. + */ + smp_wmb(); + out: /* * Don't set the TRUNCATED flag in snapshot mode because 1) the
Hi Mathieu, Suzuki,
On Mon, Aug 09, 2021 at 07:14:00PM +0800, Leo Yan wrote:
Since a memory barrier is required between AUX trace data store and aux_head store, and the AUX trace data is filled with memcpy(), it's sufficient to use smp_wmb() so can ensure the trace data is visible prior to updating aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
Could you pick up patches 02 and 03 in this series? Please note, patch 02 has the review tag from Suzuki, but I didn't receive the review tag for patch 03.
If anything need to follow up, just let me know. Thanks!
drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index acdb59e0e661..13fd1fc730ed 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev, */ if (etr_perf->snapshot) handle->head += size;
- /*
* Ensure that the AUX trace data is visible before the aux_head
* is updated via perf_aux_output_end(), as expected by the
* perf ring buffer.
*/
- smp_wmb();
out: /* * Don't set the TRUNCATED flag in snapshot mode because 1) the -- 2.25.1
On 29/08/2021 11:55, Leo Yan wrote:
Hi Mathieu, Suzuki,
On Mon, Aug 09, 2021 at 07:14:00PM +0800, Leo Yan wrote:
Since a memory barrier is required between AUX trace data store and aux_head store, and the AUX trace data is filled with memcpy(), it's sufficient to use smp_wmb() so can ensure the trace data is visible prior to updating aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org Reviewed-by: Suzuki K Poulose suzuki.poulose@arm.com
Could you pick up patches 02 and 03 in this series? Please note, patch 02 has the review tag from Suzuki, but I didn't receive the review tag for patch 03.
If anything need to follow up, just let me know. Thanks!
I have picked up both the patches.
Thanks Suzuki
drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index acdb59e0e661..13fd1fc730ed 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev, */ if (etr_perf->snapshot) handle->head += size;
- /*
* Ensure that the AUX trace data is visible before the aux_head
* is updated via perf_aux_output_end(), as expected by the
* perf ring buffer.
*/
- smp_wmb();
- out: /*
- Don't set the TRUNCATED flag in snapshot mode because 1) the
-- 2.25.1
Since the function CS_LOCK() has contained memory barrier mb(), it ensures the visibility of the AUX trace data before updating the aux_head, thus it's needless to add any explicit barrier anymore.
Add comment to make clear for the barrier usage for ETF.
Signed-off-by: Leo Yan leo.yan@linaro.org --- drivers/hwtracing/coresight/coresight-tmc-etf.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index cd0fb7bfba68..8debd4f40f06 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -553,6 +553,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, if (buf->snapshot) handle->head += to_read;
+ /* + * CS_LOCK() contains mb() so it can ensure visibility of the AUX trace + * data before the aux_head is updated via perf_aux_output_end(), which + * is expected by the perf ring buffer. + */ CS_LOCK(drvdata->base); out: spin_unlock_irqrestore(&drvdata->spinlock, flags);
On 09/08/2021 12:14, Leo Yan wrote:
Since the function CS_LOCK() has contained memory barrier mb(), it ensures the visibility of the AUX trace data before updating the aux_head, thus it's needless to add any explicit barrier anymore.
Add comment to make clear for the barrier usage for ETF.
Signed-off-by: Leo Yan leo.yan@linaro.org
drivers/hwtracing/coresight/coresight-tmc-etf.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index cd0fb7bfba68..8debd4f40f06 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -553,6 +553,11 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, if (buf->snapshot) handle->head += to_read;
- /*
* CS_LOCK() contains mb() so it can ensure visibility of the AUX trace
* data before the aux_head is updated via perf_aux_output_end(), which
* is expected by the perf ring buffer.
CS_LOCK(drvdata->base); out: spin_unlock_irqrestore(&drvdata->spinlock, flags);*/
I will queue this.
Thanks Suzuki
Since BTS is coherent, simply add a compiler barrier to separate the BTS update and aux_head store.
Signed-off-by: Leo Yan leo.yan@linaro.org --- arch/x86/events/intel/bts.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c index 6320d2cfd9d3..974e917e65b2 100644 --- a/arch/x86/events/intel/bts.c +++ b/arch/x86/events/intel/bts.c @@ -209,6 +209,12 @@ static void bts_update(struct bts_ctx *bts) } else { local_set(&buf->data_size, head); } + + /* + * Since BTS is coherent, just add compiler barrier to ensure + * BTS updating is ordered against bts::handle::event. + */ + barrier(); }
static int
Hi Peter, or any x86 maintainer,
On Mon, Aug 09, 2021 at 07:14:02PM +0800, Leo Yan wrote:
Since BTS is coherent, simply add a compiler barrier to separate the BTS update and aux_head store.
Could you reivew this patch and check if BTS needs the comipler barrier in this case? Thanks.
Signed-off-by: Leo Yan leo.yan@linaro.org
arch/x86/events/intel/bts.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c index 6320d2cfd9d3..974e917e65b2 100644 --- a/arch/x86/events/intel/bts.c +++ b/arch/x86/events/intel/bts.c @@ -209,6 +209,12 @@ static void bts_update(struct bts_ctx *bts) } else { local_set(&buf->data_size, head); }
- /*
* Since BTS is coherent, just add compiler barrier to ensure
* BTS updating is ordered against bts::handle::event.
*/
- barrier();
} static int -- 2.25.1
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory behaviour.
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org --- tools/perf/util/auxtrace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index cc1c1b9cec9c..79227b8864cd 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -480,7 +480,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) /* Ensure all reads are done before we write the tail out */ smp_mb(); #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) - pc->aux_tail = tail; + WRITE_ONCE(pc->aux_tail, tail); #else do { old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
Em Mon, Aug 09, 2021 at 07:14:03PM +0800, Leo Yan escreveu:
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory behaviour.
Thanks, applied to perf/core.
- Arnaldo
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
tools/perf/util/auxtrace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index cc1c1b9cec9c..79227b8864cd 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -480,7 +480,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) /* Ensure all reads are done before we write the tail out */ smp_mb(); #if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
- pc->aux_tail = tail;
- WRITE_ONCE(pc->aux_tail, tail);
#else do { old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0); -- 2.25.1
The main purpose for using __sync built-in functions is to support compat mode for 32-bit perf with 64-bit kernel. But using these built-in functions might cause potential issues.
__sync functions originally support Intel Itanium processoer [1] but it cannot promise to support all 32-bit archs. Now these functions have become the legacy functions.
Considering __sync functions cannot really fix the 64-bit value atomicity on 32-bit archs, thus this patch drops __sync functions.
Credits to Peter for detailed analysis.
[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005...
Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/auxtrace.h | 19 ------------------- 1 file changed, 19 deletions(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index 79227b8864cd..4f9176368134 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -440,12 +440,6 @@ struct auxtrace_cache;
#ifdef HAVE_AUXTRACE_SUPPORT
-/* - * In snapshot mode the mmapped page is read-only which makes using - * __sync_val_compare_and_swap() problematic. However, snapshot mode expects - * the buffer is not updated while the snapshot is made (e.g. Intel PT disables - * the event) so there is not a race anyway. - */ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; @@ -459,11 +453,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) u64 head = READ_ONCE(pc->aux_head); -#else - u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0); -#endif
/* Ensure all reads are done after we read the head */ smp_rmb(); @@ -473,19 +463,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) { struct perf_event_mmap_page *pc = mm->userpg; -#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) - u64 old_tail; -#endif
/* Ensure all reads are done before we write the tail out */ smp_mb(); -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) WRITE_ONCE(pc->aux_tail, tail); -#else - do { - old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0); - } while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail)); -#endif }
int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
Em Mon, Aug 09, 2021 at 07:14:04PM +0800, Leo Yan escreveu:
The main purpose for using __sync built-in functions is to support compat mode for 32-bit perf with 64-bit kernel. But using these built-in functions might cause potential issues.
__sync functions originally support Intel Itanium processoer [1] but it cannot promise to support all 32-bit archs. Now these functions have become the legacy functions.
Considering __sync functions cannot really fix the 64-bit value atomicity on 32-bit archs, thus this patch drops __sync functions.
Credits to Peter for detailed analysis.
Thanks, applied to perf/core.
- Arnaldo
[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005...
Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/auxtrace.h | 19 ------------------- 1 file changed, 19 deletions(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index 79227b8864cd..4f9176368134 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -440,12 +440,6 @@ struct auxtrace_cache; #ifdef HAVE_AUXTRACE_SUPPORT -/*
- In snapshot mode the mmapped page is read-only which makes using
- __sync_val_compare_and_swap() problematic. However, snapshot mode expects
- the buffer is not updated while the snapshot is made (e.g. Intel PT disables
- the event) so there is not a race anyway.
- */
static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; @@ -459,11 +453,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) u64 head = READ_ONCE(pc->aux_head); -#else
- u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0);
-#endif /* Ensure all reads are done after we read the head */ smp_rmb(); @@ -473,19 +463,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) { struct perf_event_mmap_page *pc = mm->userpg; -#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
- u64 old_tail;
-#endif /* Ensure all reads are done before we write the tail out */ smp_mb(); -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) WRITE_ONCE(pc->aux_tail, tail); -#else
- do {
old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0);
- } while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail));
-#endif } int auxtrace_mmap__mmap(struct auxtrace_mmap *mm, -- 2.25.1
Since the function auxtrace_mmap__read_snapshot_head() is exactly same with auxtrace_mmap__read_head(), whether the session is in snapshot mode or not, it's unified to use function auxtrace_mmap__read_head() for reading AUX buffer head.
And the function auxtrace_mmap__read_snapshot_head() is unused so this patch removes it.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/auxtrace.c | 13 +++++-------- tools/perf/util/auxtrace.h | 10 ---------- 2 files changed, 5 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index cb19669d2a5b..2dcf3d12ba32 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -1686,14 +1686,11 @@ static int __auxtrace_mmap__read(struct mmap *map, union perf_event ev; void *data1, *data2;
- if (snapshot) { - head = auxtrace_mmap__read_snapshot_head(mm); - if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data, - &head, &old)) - return -1; - } else { - head = auxtrace_mmap__read_head(mm); - } + head = auxtrace_mmap__read_head(mm); + + if (snapshot && + auxtrace_record__find_snapshot(itr, mm->idx, mm, data, &head, &old)) + return -1;
if (old == head) return 0; diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index 4f9176368134..d68a5e80b217 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -440,16 +440,6 @@ struct auxtrace_cache;
#ifdef HAVE_AUXTRACE_SUPPORT
-static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) -{ - struct perf_event_mmap_page *pc = mm->userpg; - u64 head = READ_ONCE(pc->aux_head); - - /* Ensure all reads are done after we read the head */ - smp_rmb(); - return head; -} - static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg;
Em Mon, Aug 09, 2021 at 07:14:05PM +0800, Leo Yan escreveu:
Since the function auxtrace_mmap__read_snapshot_head() is exactly same with auxtrace_mmap__read_head(), whether the session is in snapshot mode or not, it's unified to use function auxtrace_mmap__read_head() for reading AUX buffer head.
And the function auxtrace_mmap__read_snapshot_head() is unused so this patch removes it.
Thanks, applied to perf/core.
- Arnaldo
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/util/auxtrace.c | 13 +++++-------- tools/perf/util/auxtrace.h | 10 ---------- 2 files changed, 5 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index cb19669d2a5b..2dcf3d12ba32 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -1686,14 +1686,11 @@ static int __auxtrace_mmap__read(struct mmap *map, union perf_event ev; void *data1, *data2;
- if (snapshot) {
head = auxtrace_mmap__read_snapshot_head(mm);
if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data,
&head, &old))
return -1;
- } else {
head = auxtrace_mmap__read_head(mm);
- }
- head = auxtrace_mmap__read_head(mm);
- if (snapshot &&
auxtrace_record__find_snapshot(itr, mm->idx, mm, data, &head, &old))
return -1;
if (old == head) return 0; diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index 4f9176368134..d68a5e80b217 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -440,16 +440,6 @@ struct auxtrace_cache; #ifdef HAVE_AUXTRACE_SUPPORT -static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) -{
- struct perf_event_mmap_page *pc = mm->userpg;
- u64 head = READ_ONCE(pc->aux_head);
- /* Ensure all reads are done after we read the head */
- smp_rmb();
- return head;
-}
static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; -- 2.25.1
Since the __sync functions have been dropped, This patch removes unused build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/Makefile.config | 4 ---- tools/perf/util/auxtrace.c | 5 ----- 2 files changed, 9 deletions(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index eb8e487ef90b..4a0d9a6defc7 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -349,10 +349,6 @@ CXXFLAGS += $(INC_FLAGS)
LIBPERF_CFLAGS := $(CORE_CFLAGS) $(EXTRA_CFLAGS)
-ifeq ($(feature-sync-compare-and-swap), 1) - CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT -endif - ifeq ($(feature-pthread-attr-setaffinity-np), 1) CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP endif diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index 2dcf3d12ba32..f33f09b8b535 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -130,11 +130,6 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm, return 0; }
-#if BITS_PER_LONG != 64 && !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) - pr_err("Cannot use AUX area tracing mmaps\n"); - return -1; -#endif - pc->aux_offset = mp->offset; pc->aux_size = mp->len;
Em Mon, Aug 09, 2021 at 07:14:06PM +0800, Leo Yan escreveu:
Since the __sync functions have been dropped, This patch removes unused build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.
Thanks, applied to perf/core.
- Arnaldo
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/perf/Makefile.config | 4 ---- tools/perf/util/auxtrace.c | 5 ----- 2 files changed, 9 deletions(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index eb8e487ef90b..4a0d9a6defc7 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -349,10 +349,6 @@ CXXFLAGS += $(INC_FLAGS) LIBPERF_CFLAGS := $(CORE_CFLAGS) $(EXTRA_CFLAGS) -ifeq ($(feature-sync-compare-and-swap), 1)
- CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
-endif
ifeq ($(feature-pthread-attr-setaffinity-np), 1) CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP endif diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index 2dcf3d12ba32..f33f09b8b535 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -130,11 +130,6 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm, return 0; } -#if BITS_PER_LONG != 64 && !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT)
- pr_err("Cannot use AUX area tracing mmaps\n");
- return -1;
-#endif
- pc->aux_offset = mp->offset; pc->aux_size = mp->len;
2.25.1
Since the __sync functions have been removed from perf, it's needless for perf tool to test the feature sync-compare-and-swap.
The feature test is not used by any other components, remove it.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/build/Makefile.feature | 1 - tools/build/feature/Makefile | 4 ---- tools/build/feature/test-all.c | 4 ---- tools/build/feature/test-sync-compare-and-swap.c | 15 --------------- 4 files changed, 24 deletions(-) delete mode 100644 tools/build/feature/test-sync-compare-and-swap.c
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 04a8e3db8a54..3dd2f68366f9 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -34,7 +34,6 @@ FEATURE_TESTS_BASIC := \ dwarf_getlocations \ eventfd \ fortify-source \ - sync-compare-and-swap \ get_current_dir_name \ gettid \ glibc \ diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index ec203e28407f..eff55d287db1 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -9,7 +9,6 @@ FILES= \ test-dwarf_getlocations.bin \ test-eventfd.bin \ test-fortify-source.bin \ - test-sync-compare-and-swap.bin \ test-get_current_dir_name.bin \ test-glibc.bin \ test-gtk2.bin \ @@ -260,9 +259,6 @@ $(OUTPUT)test-libdw-dwarf-unwind.bin: $(OUTPUT)test-libbabeltrace.bin: $(BUILD) # -lbabeltrace provided by $(FEATURE_CHECK_LDFLAGS-libbabeltrace)
-$(OUTPUT)test-sync-compare-and-swap.bin: - $(BUILD) - $(OUTPUT)test-compile-32.bin: $(CC) -m32 -o $@ test-compile.c
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index 464873883396..920439527291 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -106,10 +106,6 @@ # include "test-libdw-dwarf-unwind.c" #undef main
-#define main main_test_sync_compare_and_swap -# include "test-sync-compare-and-swap.c" -#undef main - #define main main_test_zlib # include "test-zlib.c" #undef main diff --git a/tools/build/feature/test-sync-compare-and-swap.c b/tools/build/feature/test-sync-compare-and-swap.c deleted file mode 100644 index 3bc6b0768a53..000000000000 --- a/tools/build/feature/test-sync-compare-and-swap.c +++ /dev/null @@ -1,15 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <stdint.h> - -volatile uint64_t x; - -int main(int argc, char *argv[]) -{ - uint64_t old, new = argc; - - (void)argv; - do { - old = __sync_val_compare_and_swap(&x, 0, 0); - } while (!__sync_bool_compare_and_swap(&x, old, new)); - return old == new; -}
Em Mon, Aug 09, 2021 at 07:14:07PM +0800, Leo Yan escreveu:
Since the __sync functions have been removed from perf, it's needless for perf tool to test the feature sync-compare-and-swap.
The feature test is not used by any other components, remove it.
Thanks, applied to perf/core.
- Arnaldo
Signed-off-by: Leo Yan leo.yan@linaro.org
tools/build/Makefile.feature | 1 - tools/build/feature/Makefile | 4 ---- tools/build/feature/test-all.c | 4 ---- tools/build/feature/test-sync-compare-and-swap.c | 15 --------------- 4 files changed, 24 deletions(-) delete mode 100644 tools/build/feature/test-sync-compare-and-swap.c
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 04a8e3db8a54..3dd2f68366f9 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -34,7 +34,6 @@ FEATURE_TESTS_BASIC := \ dwarf_getlocations \ eventfd \ fortify-source \
sync-compare-and-swap \ get_current_dir_name \ gettid \ glibc \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index ec203e28407f..eff55d287db1 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -9,7 +9,6 @@ FILES= \ test-dwarf_getlocations.bin \ test-eventfd.bin \ test-fortify-source.bin \
test-sync-compare-and-swap.bin \ test-get_current_dir_name.bin \ test-glibc.bin \ test-gtk2.bin \
@@ -260,9 +259,6 @@ $(OUTPUT)test-libdw-dwarf-unwind.bin: $(OUTPUT)test-libbabeltrace.bin: $(BUILD) # -lbabeltrace provided by $(FEATURE_CHECK_LDFLAGS-libbabeltrace) -$(OUTPUT)test-sync-compare-and-swap.bin:
- $(BUILD)
$(OUTPUT)test-compile-32.bin: $(CC) -m32 -o $@ test-compile.c diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index 464873883396..920439527291 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -106,10 +106,6 @@ # include "test-libdw-dwarf-unwind.c" #undef main -#define main main_test_sync_compare_and_swap -# include "test-sync-compare-and-swap.c" -#undef main
#define main main_test_zlib # include "test-zlib.c" #undef main diff --git a/tools/build/feature/test-sync-compare-and-swap.c b/tools/build/feature/test-sync-compare-and-swap.c deleted file mode 100644 index 3bc6b0768a53..000000000000 --- a/tools/build/feature/test-sync-compare-and-swap.c +++ /dev/null @@ -1,15 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <stdint.h>
-volatile uint64_t x;
-int main(int argc, char *argv[]) -{
- uint64_t old, new = argc;
- (void)argv;
- do {
old = __sync_val_compare_and_swap(&x, 0, 0);
- } while (!__sync_bool_compare_and_swap(&x, old, new));
- return old == new;
-}
2.25.1