This patch series is to refine the memory barriers for AUX ring buffer.
Patches 01 ~ 04 to address the barriers usage in the kernel. The first patch is to make clear comment for how to use the barriers between the data store and aux_head store, this asks the driver to make sure the data is visible. Patches 02 ~ 04 is to refine the drivers for barriers after the data store.
Patches 05 ~ 07 is to drop the legacy __sync functions, and polish for duplicate code and cleanup the build after SYNC_COMPARE_AND_SWAP is not used.
Patch 08 is to use WRITE_ONCE() for updating aux_tail.
Since the 64-bit value's atomicity is not promised on 32-bit perf, the last two patches tries to fixup for perf tool when it runs in compat mode. Patch 09 introduces a new global variable to indicate the kernel runs in 64-bit mode which can be used to confirm if in compat mode; patch 10 introduces variant functions for accessing AUX head/tail, it can resolve the aotmicity issue for reading head pointer, and for the tail write overflow issue it returns error to notify the tool to exit.
Have testes the patches on Arm64 Juno platform.
Changes from v2: - Removed auxtrace_mmap__read_snapshot_head(), which has the duplicated code with auxtrace_mmap__read_head(); - Cleanuped the build for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT (Adrian); - Added global variable "kernel_is_64_bit" (Adrian); - Added compat variants compat_auxtrace_mmap__{read_head|write_tail} (Adrian).
Leo Yan (10): perf/ring_buffer: Add comment for barriers on AUX ring buffer coresight: tmc-etr: Add barrier after updating AUX ring buffer coresight: tmc-etf: Add comment for store ordering perf/x86: Add barrier after updating bts perf auxtrace: Drop legacy __sync functions perf auxtrace: Remove auxtrace_mmap__read_snapshot_head() perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT perf auxtrace: Use WRITE_ONCE() for updating aux_tail perf env: Set kernel bit mode perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
arch/x86/events/intel/bts.c | 3 + .../hwtracing/coresight/coresight-tmc-etf.c | 6 + .../hwtracing/coresight/coresight-tmc-etr.c | 8 ++ kernel/events/ring_buffer.c | 9 ++ tools/perf/Makefile.config | 4 - tools/perf/util/auxtrace.c | 19 ++- tools/perf/util/auxtrace.h | 109 ++++++++++++++---- tools/perf/util/env.c | 17 ++- tools/perf/util/env.h | 1 + 9 files changed, 136 insertions(+), 40 deletions(-)
AUX ring buffer applies almost the same barriers as perf ring buffer, but there has an exception for ordering between writing the AUX trace data and updating user_page::aux_head.
This patch adds comment for how to use the barriers on AUX ring buffer, and gives comment to ask the drivers to flush the trace data into AUX ring buffer prior to updating user_page::aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Peter Zijlstra (Intel) peterz@infradead.org --- kernel/events/ring_buffer.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 52868716ec35..5cf6579be05e 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -509,6 +509,15 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size) perf_event_aux_event(handle->event, aux_head, size, handle->aux_flags);
+ /* + * See perf_output_put_handle(), AUX ring buffer applies the same + * barrier pairing as the perf ring buffer; except for B, since + * AUX ring buffer is written by hardware trace, we cannot simply + * use the generic memory barrier (like smp_wmb()) prior to update + * user_page::aux_head, the hardware trace driver takes the + * responsibility to ensure the trace data has been flushed into + * the AUX buffer before calling perf_aux_output_end(). + */ WRITE_ONCE(rb->user_page->aux_head, rb->aux_head); if (rb_need_aux_wakeup(rb)) wakeup = true;
Since a memory barrier is required between AUX trace data store and aux_head store, and the AUX trace data is filled with memcpy(), it's sufficient to use smp_wmb() so can ensure the trace data is visible prior to updating aux_head.
Signed-off-by: Leo Yan leo.yan@linaro.org --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index acdb59e0e661..713205db15a1 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1563,6 +1563,14 @@ tmc_update_etr_buffer(struct coresight_device *csdev, */ if (etr_perf->snapshot) handle->head += size; + + /* + * It requires the ordering between the AUX trace data and aux_head + * store, below smp_wmb() ensures the AUX trace data is visible prior + * to updating aux_head. + */ + smp_wmb(); + out: /* * Don't set the TRUNCATED flag in snapshot mode because 1) the
AUX ring buffer is required to separate the data store and aux_head store, since the function CS_LOCK() has contained memory barrier mb(), mb() is a more conservative barrier than smp_wmb() on Arm32/Arm64, thus it's needless to add any explicit barrier anymore.
Add comment to make clear for the barrier usage for ETF.
Signed-off-by: Leo Yan leo.yan@linaro.org --- drivers/hwtracing/coresight/coresight-tmc-etf.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 45b85edfc690..9a42ee689921 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -553,6 +553,12 @@ static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, if (buf->snapshot) handle->head += to_read;
+ /* + * AUX ring buffer requires to use memory barrier to separate the trace + * data store and aux_head store, because CS_LOCK() contains mb() which + * gives more heavy barrier than smp_wmb(), it's not necessary to + * explicitly invoke any barrier. + */ CS_LOCK(drvdata->base); out: spin_unlock_irqrestore(&drvdata->spinlock, flags);
Add barrier wmb() to separate the AUX data store and aux_head store.
Signed-off-by: Leo Yan leo.yan@linaro.org --- arch/x86/events/intel/bts.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c index 6320d2cfd9d3..4a015d160bc5 100644 --- a/arch/x86/events/intel/bts.c +++ b/arch/x86/events/intel/bts.c @@ -209,6 +209,9 @@ static void bts_update(struct bts_ctx *bts) } else { local_set(&buf->data_size, head); } + + /* The WMB separates data store and aux_head store matches. */ + wmb(); }
static int
The main purpose for using __sync built-in functions is to support compat mode for 32-bit perf with 64-bit kernel. But using these built-in functions might cause couple potential issues.
Firstly, __sync functions originally support Intel Itanium processoer [1] but it cannot promise to support all 32-bit archs. Now these functions have become the legacy functions.
As Peter also pointed out the logic issue in the function auxtrace_mmap__write_tail(), it does a cmpxchg with 0 values to load old_tail, and then executes a further cmpxchg with old_tail to write the new tail. If consider the aux_tail might be assigned to '0' in the middle of loops, this can introduce mess for AUX buffer if the kernel fetches the temporary value '0'.
Considering __sync functions cannot really fix the 64-bit value atomicity on 32-bit archs, thus this patch drops __sync functions.
Credits to Peter for detailed analysis.
[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005...
Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/auxtrace.h | 19 ------------------- 1 file changed, 19 deletions(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index cc1c1b9cec9c..f489ca159997 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -440,12 +440,6 @@ struct auxtrace_cache;
#ifdef HAVE_AUXTRACE_SUPPORT
-/* - * In snapshot mode the mmapped page is read-only which makes using - * __sync_val_compare_and_swap() problematic. However, snapshot mode expects - * the buffer is not updated while the snapshot is made (e.g. Intel PT disables - * the event) so there is not a race anyway. - */ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; @@ -459,11 +453,7 @@ static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) u64 head = READ_ONCE(pc->aux_head); -#else - u64 head = __sync_val_compare_and_swap(&pc->aux_head, 0, 0); -#endif
/* Ensure all reads are done after we read the head */ smp_rmb(); @@ -473,19 +463,10 @@ static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) { struct perf_event_mmap_page *pc = mm->userpg; -#if BITS_PER_LONG != 64 && defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) - u64 old_tail; -#endif
/* Ensure all reads are done before we write the tail out */ smp_mb(); -#if BITS_PER_LONG == 64 || !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) pc->aux_tail = tail; -#else - do { - old_tail = __sync_val_compare_and_swap(&pc->aux_tail, 0, 0); - } while (!__sync_bool_compare_and_swap(&pc->aux_tail, old_tail, tail)); -#endif }
int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
Since the function auxtrace_mmap__read_snapshot_head() is exactly same with auxtrace_mmap__read_head(), whether the session is in snapshot mode or not, it's unified to use function auxtrace_mmap__read_head() for reading AUX buffer head.
And the function auxtrace_mmap__read_snapshot_head() is unused so this patch removes it.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/auxtrace.c | 5 ++--- tools/perf/util/auxtrace.h | 10 ---------- 2 files changed, 2 insertions(+), 13 deletions(-)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index 9350eeb3a3fc..44bd04dbb6d6 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -1686,13 +1686,12 @@ static int __auxtrace_mmap__read(struct mmap *map, union perf_event ev; void *data1, *data2;
+ head = auxtrace_mmap__read_head(mm); + if (snapshot) { - head = auxtrace_mmap__read_snapshot_head(mm); if (auxtrace_record__find_snapshot(itr, mm->idx, mm, data, &head, &old)) return -1; - } else { - head = auxtrace_mmap__read_head(mm); }
if (old == head) diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index f489ca159997..1a2f42980e3f 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -440,16 +440,6 @@ struct auxtrace_cache;
#ifdef HAVE_AUXTRACE_SUPPORT
-static inline u64 auxtrace_mmap__read_snapshot_head(struct auxtrace_mmap *mm) -{ - struct perf_event_mmap_page *pc = mm->userpg; - u64 head = READ_ONCE(pc->aux_head); - - /* Ensure all reads are done after we read the head */ - smp_rmb(); - return head; -} - static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg;
Since the __sync functions have been dropped, This patch removes unused build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.
Note, there have a test for SYNC_COMPARE_AND_SWAP and the test file is located in build/feature/test-sync-compare-and-swap.c. Since there still has several components using the sync functions, it's deliberately to not be removed.
$ cd linux/tools $ git grep __sync_val_compare_and_swap | awk '{ printf $1"\n" }' build/feature/test-sync-compare-and-swap.c: include/asm-generic/atomic-gcc.h: testing/selftests/bpf/progs/atomics.c: testing/selftests/bpf/progs/atomics.c: testing/selftests/bpf/progs/atomics.c: testing/selftests/bpf/progs/atomics.c: testing/selftests/futex/include/atomic.h: testing/selftests/futex/include/futextest.h:
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/Makefile.config | 4 ---- tools/perf/util/auxtrace.c | 5 ----- 2 files changed, 9 deletions(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index eb8e487ef90b..4a0d9a6defc7 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -349,10 +349,6 @@ CXXFLAGS += $(INC_FLAGS)
LIBPERF_CFLAGS := $(CORE_CFLAGS) $(EXTRA_CFLAGS)
-ifeq ($(feature-sync-compare-and-swap), 1) - CFLAGS += -DHAVE_SYNC_COMPARE_AND_SWAP_SUPPORT -endif - ifeq ($(feature-pthread-attr-setaffinity-np), 1) CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP endif diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index 44bd04dbb6d6..db6255b55c90 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -130,11 +130,6 @@ int auxtrace_mmap__mmap(struct auxtrace_mmap *mm, return 0; }
-#if BITS_PER_LONG != 64 && !defined(HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT) - pr_err("Cannot use AUX area tracing mmaps\n"); - return -1; -#endif - pc->aux_offset = mp->offset; pc->aux_size = mp->len;
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory behaviour.
Signed-off-by: Leo Yan leo.yan@linaro.org Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org --- tools/perf/util/auxtrace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index 1a2f42980e3f..d68a5e80b217 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -456,7 +456,7 @@ static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail)
/* Ensure all reads are done before we write the tail out */ smp_mb(); - pc->aux_tail = tail; + WRITE_ONCE(pc->aux_tail, tail); }
int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
It's useful to know that the kernel is running in 32-bit or 64-bit mode. E.g. We can decide perf tool is running in compat mode when detects kernel is running in 64-bit mode and the tool is in 32-bit mode with the compiler macro BITS_PER_LONG is 32.
This patch adds a global variable "kernel_is_64_bit", during the environment initialization for the session, the kernel running mode is decided by checking the architecture string.
Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/env.c | 17 ++++++++++++++++- tools/perf/util/env.h | 1 + 2 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c index ebc5e9ad35db..345635a2e842 100644 --- a/tools/perf/util/env.c +++ b/tools/perf/util/env.c @@ -11,6 +11,7 @@ #include <stdlib.h> #include <string.h>
+int kernel_is_64_bit; struct perf_env perf_env;
#ifdef HAVE_LIBBPF_SUPPORT @@ -172,6 +173,19 @@ static void perf_env__purge_bpf(struct perf_env *env __maybe_unused) } #endif // HAVE_LIBBPF_SUPPORT
+static void perf_env__init_kernel_mode(struct perf_env *env) +{ + const char *arch = perf_env__raw_arch(env); + + if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) || + !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) || + !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) || + !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7)) + kernel_is_64_bit = 1; + else + kernel_is_64_bit = 0; +} + void perf_env__exit(struct perf_env *env) { int i; @@ -217,13 +231,14 @@ void perf_env__exit(struct perf_env *env) zfree(&env->hybrid_cpc_nodes); }
-void perf_env__init(struct perf_env *env __maybe_unused) +void perf_env__init(struct perf_env *env) { #ifdef HAVE_LIBBPF_SUPPORT env->bpf_progs.infos = RB_ROOT; env->bpf_progs.btfs = RB_ROOT; init_rwsem(&env->bpf_progs.lock); #endif + perf_env__init_kernel_mode(env); }
int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]) diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h index 6824a7423a2d..cc989ff49740 100644 --- a/tools/perf/util/env.h +++ b/tools/perf/util/env.h @@ -139,6 +139,7 @@ enum perf_compress_type { struct bpf_prog_info_node; struct btf_node;
+extern int kernel_is_64_bit; extern struct perf_env perf_env;
void perf_env__exit(struct perf_env *env);
When perf runs in compat mode (kernel in 64-bit mode and the perf is in 32-bit mode), the 64-bit value atomicity in the user space cannot be assured, E.g. on some architectures, the 64-bit value accessing is split into two instructions, one is the low 32-bit accessing and another is for the high 32-bit.
This patch introduces two functions compat_auxtrace_mmap__read_head() and compat_auxtrace_mmap__write_tail(), as their naming indicates, when perf tool works in compat mode, it uses these two functions to access the AUX head and tail. These two functions can allow the perf tool to work properly in certain conditions, e.g. when perf tool works in snapshot mode with only using AUX head pointer, or perf tool uses the AUX buffer and the tail is not bigger than 4GB.
When the perf tool cannot handle the case when the AUX tail is overflow for 4GB, the function compat_auxtrace_mmap__write_tail() returns -1 and the caller is changed to bail out for the overflow error.
Suggested-by: Adrian Hunter adrian.hunter@intel.com Signed-off-by: Leo Yan leo.yan@linaro.org --- tools/perf/util/auxtrace.c | 9 ++-- tools/perf/util/auxtrace.h | 94 +++++++++++++++++++++++++++++++++++++- 2 files changed, 98 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c index db6255b55c90..a430d74b279f 100644 --- a/tools/perf/util/auxtrace.c +++ b/tools/perf/util/auxtrace.c @@ -1766,10 +1766,13 @@ static int __auxtrace_mmap__read(struct mmap *map, mm->prev = head;
if (!snapshot) { - auxtrace_mmap__write_tail(mm, head); - if (itr->read_finish) { - int err; + int err;
+ err = auxtrace_mmap__write_tail(mm, head); + if (err < 0) + return err; + + if (itr->read_finish) { err = itr->read_finish(itr, mm->idx); if (err < 0) return err; diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h index d68a5e80b217..66de7b6e65ec 100644 --- a/tools/perf/util/auxtrace.h +++ b/tools/perf/util/auxtrace.h @@ -18,6 +18,8 @@ #include <asm/bitsperlong.h> #include <asm/barrier.h>
+#include "env.h" + union perf_event; struct perf_session; struct evlist; @@ -440,23 +442,111 @@ struct auxtrace_cache;
#ifdef HAVE_AUXTRACE_SUPPORT
+/* + * In the compat mode kernel runs in 64-bit and perf tool runs in 32-bit mode, + * 32-bit perf tool cannot access 64-bit value atomically, which might lead to + * the issues caused by the below sequence on multiple CPUs: when perf tool + * accesses either the load operation or the store operation for 64-bit value, + * on some architectures the operation is divided into two instructions, one + * is for accessing the low 32-bit value and another is for the high 32-bit; + * thus these two user operations can give the kernel chances to access the + * 64-bit value, and thus leads to the unexpected load values. + * + * kernel (64-bit) user (32-bit) + * + * if (LOAD ->aux_tail) { --, LOAD ->aux_head_lo + * STORE $aux_data | ,---> + * FLUSH $aux_data | | LOAD ->aux_head_hi + * STORE ->aux_head --|-------` smp_rmb() + * } | LOAD $data + * | smp_mb() + * | STORE ->aux_tail_lo + * `-----------> + * STORE ->aux_tail_hi + * + * For this reason, it's impossible for the perf tool to work correctly when + * the AUX head or tail is bigger than 4GB (more than 32 bits length); and we + * can not simply limit the AUX ring buffer to less than 4GB, the reason is + * the pointers can be increased monotonically (e.g in snapshot mode), whatever + * the buffer size it is, at the end the head and tail can be bigger than 4GB + * and carry out to the high 32-bit. + * + * To mitigate the issues and improve the user experience, we can allow the + * perf tool working in certain conditions and bail out with error if detect + * any overflow cannot be handled. + * + * For reading the AUX head, it reads out the values for three times, and + * compares the high 4 bytes of the values between the first time and the last + * time, if there has no change for high 4 bytes injected by the kernel during + * the user reading sequence, it's safe for use the second value. + * + * When update the AUX tail and detects any carrying in the high 32 bits, it + * means there have two store operations in user space and it cannot promise + * the atomicity for 64-bit write, so return '-1' in this case to tell the + * caller an overflow error has happened. + */ +static inline u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm) +{ + struct perf_event_mmap_page *pc = mm->userpg; + u64 first, second, last; + u64 mask = (u64)(UINT32_MAX) << 32; + + do { + first = READ_ONCE(pc->aux_head); + /* Ensure all reads are done after we read the head */ + smp_rmb(); + second = READ_ONCE(pc->aux_head); + /* Ensure all reads are done after we read the head */ + smp_rmb(); + last = READ_ONCE(pc->aux_head); + } while ((first & mask) != (last & mask)); + + return second; +} + +static inline int compat_auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, + u64 tail) +{ + struct perf_event_mmap_page *pc = mm->userpg; + u64 mask = (u64)(UINT32_MAX) << 32; + + if (tail & mask) + return -1; + + /* Ensure all reads are done before we write the tail out */ + smp_mb(); + WRITE_ONCE(pc->aux_tail, tail); + return 0; +} + static inline u64 auxtrace_mmap__read_head(struct auxtrace_mmap *mm) { struct perf_event_mmap_page *pc = mm->userpg; - u64 head = READ_ONCE(pc->aux_head); + u64 head; + +#if BITS_PER_LONG == 32 + if (kernel_is_64_bit) + return compat_auxtrace_mmap__read_head(mm); +#endif + head = READ_ONCE(pc->aux_head);
/* Ensure all reads are done after we read the head */ smp_rmb(); return head; }
-static inline void auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) +static inline int auxtrace_mmap__write_tail(struct auxtrace_mmap *mm, u64 tail) { struct perf_event_mmap_page *pc = mm->userpg;
+#if BITS_PER_LONG == 32 + if (kernel_is_64_bit) + return compat_auxtrace_mmap__write_tail(mm, tail); +#endif /* Ensure all reads are done before we write the tail out */ smp_mb(); WRITE_ONCE(pc->aux_tail, tail); + return 0; }
int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,