Hi James,
On Mon, Aug 23, 2021 at 01:23:42PM +0100, James Clark wrote:
On 09/08/2021 12:27, Leo Yan wrote:
When the tool runs in compat mode on an Arm platform, the kernel is in 64-bit mode and user space is in 32-bit mode; user space can use the "ldrd" and "strd" instructions for 64-bit value atomicity.
This patch adds compat_auxtrace_mmap__{read_head|write_tail} for arm builds; they use the "ldrd" and "strd" instructions to ensure atomic access to the AUX head and tail. The file arch/arm/util/auxtrace.c is built for both arm and arm64; these two functions are not needed for arm64, so check the compiler macro "__arm__" to include them only for arm builds.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
 tools/perf/arch/arm/util/auxtrace.c | 32 +++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index b187bddbd01a..c7c7ec0812d5 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -107,3 +107,35 @@ struct auxtrace_record
 	*err = 0;
 	return NULL;
 }
+#if defined(__arm__)
+u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
+{
+	struct perf_event_mmap_page *pc = mm->userpg;
+	u64 result;
+	__asm__ __volatile__(
+"	ldrd    %0, %H0, [%1]"
+	: "=&r" (result)
+	: "r" (&pc->aux_head), "Qo" (pc->aux_head)
+	);
+	return result;
+}
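The "__arm__" guard described in the commit message can be sketched outside of perf roughly as follows. This is only an illustration under stated assumptions: struct fake_page and read_u64_once() are hypothetical names that do not exist in the perf sources, and the non-Arm branch uses the compiler's __atomic_load_n builtin purely so the sketch builds on other hosts.

```c
#include <stdint.h>

/* Hypothetical stand-in for the aux_head field of perf_event_mmap_page. */
struct fake_page {
	uint64_t aux_head;
};

/* Hypothetical helper, not part of perf: read a 64-bit value in one access. */
static uint64_t read_u64_once(uint64_t *p)
{
#if defined(__arm__)
	uint64_t result;

	/* 32-bit Arm build: same ldrd sequence as in the quoted patch. */
	__asm__ __volatile__(
	"	ldrd	%0, %H0, [%1]"
	: "=&r" (result)
	: "r" (p), "Qo" (*p)
	);
	return result;
#else
	/* Any other build: fall back to the compiler's atomic builtin. */
	return __atomic_load_n(p, __ATOMIC_RELAXED);
#endif
}
```

Only the first branch is compiled for a 32-bit Arm build, which mirrors how the patch keeps the compat helpers out of the arm64 build of the same file.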
Hi Leo,
I see that this is a duplicate of the atomic read in arch/arm/include/asm/atomic.h
Exactly.
For x86, it's possible to include tools/include/asm/atomic.h, but that doesn't include arch/arm/include/asm/atomic.h and there are some other #ifdefs that might make it not so easy for Arm. Just wondering if you considered trying to include the existing one? Or decided that it was easier to duplicate it?
Good finding!
Your reminder made me realize that the atomic operations for arm/arm64 should be improved for user space programs. So far, the perf tool simply uses the compiler's atomic implementations (from asm-generic/atomic-gcc.h) for arm/arm64; for a more reliable implementation, I think user space programs should use the architecture's atomic instructions.
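For context, compiler-provided atomics of the kind mentioned above look roughly like the sketch below. It is not a copy of asm-generic/atomic-gcc.h; demo_atomic_t and the demo_atomic_*() helpers are hypothetical names, and the choice of GCC/Clang builtins is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical type for illustration; not the definition used by perf. */
typedef struct {
	int counter;
} demo_atomic_t;

/* Read the counter with a single atomic load. */
static inline int demo_atomic_read(const demo_atomic_t *v)
{
	return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
}

/* Atomically increment the counter using a compiler builtin. */
static inline void demo_atomic_inc(demo_atomic_t *v)
{
	__sync_add_and_fetch(&v->counter, 1);
}

/* Atomically decrement and report whether the counter reached zero. */
static inline bool demo_atomic_dec_and_test(demo_atomic_t *v)
{
	return __sync_sub_and_fetch(&v->counter, 1) == 0;
}
```

With builtins like these, the compiler decides which instructions to emit for the target, which is exactly the part that architecture-specific headers would take over.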
So I think your question should be rephrased as: should we export the arm/arm64 atomic operations to user space programs? This looks like challenging work to me; we would need to finish at least the items below:

- Support arm64 atomic operations and reuse the kernel's arch/arm64/include/asm/atomic.h;
- Support arm atomic operations and reuse the kernel's arch/arm/include/asm/atomic.h;
- For aarch32 builds, use configurations to distinguish the different cases, such as LPAE, Armv7, and Armv6 variants (so far I have no idea how to distinguish these builds gracefully in the perf tool).

I am not sure whether there is any ongoing effort in this area. If anyone is working on this (or has started related work before), then we should definitely look into how we can reuse the arch's atomic headers.

Otherwise, I would prefer to first merge this patch with its dozen lines of duplicated code; afterwards, we can send a separate patch set to support arm/arm64 atomic operations in user space.

If any Arm/Arm64 maintainers could shed some light on this part of the work, I think it would be very helpful.
Other than that, I have tested that the change works with a 32-bit build in both snapshot and normal mode.
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Thanks for testing and reviewing!
Leo