Hi Russell,
On Mon, Jul 12, 2021 at 03:44:11PM +0100, Russell King (Oracle) wrote:
> On Sun, Jul 11, 2021 at 06:41:05PM +0800, Leo Yan wrote:
> > When perf runs in compat mode (kernel in 64-bit mode and the perf is in 32-bit mode), the 64-bit value atomicity in the user space cannot be assured, E.g. on some architectures, the 64-bit value accessing is split into two instructions, one is for the low 32-bit word accessing and another is for the high 32-bit word.
>
> Does this apply to 32-bit ARM code on aarch64? I would not have thought it would, as the structure member is a __u64 and compat_auxtrace_mmap__read_head() doesn't seem to be marking anything as packed, so the compiler _should_ be able to use a LDRD instruction to load the value.
I think your question essentially comes down to the memory model. For a 32-bit Arm application running on aarch64, the Armv8 Architecture Reference Manual (ARM DDI 0487F.c), section "E2.2.1 Requirements for single-copy atomicity", states:
"LDM, LDC, LDRD, STM, STC, STRD, PUSH, POP, RFE, SRS, VLDM, VLDR, VSTM, and VSTR instructions are executed as a sequence of word-aligned word accesses. Each 32-bit word access is guaranteed to be single-copy atomic. The architecture does not require subsequences of two or more word accesses from the sequence to be single-copy atomic."
So I think the LDRD/STRD instructions cannot guarantee atomicity when loading or storing two words in 32-bit Arm.
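To make the consequence concrete, here is a minimal sketch that models a 64-bit value as two independently atomic words and replays one possible interleaving by hand. The names struct split_u64, join_words() and torn_read() are hypothetical and exist only for this illustration; this is a single-threaded model of the ordering, not code from perf.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit value held as two separately accessed words. */
struct split_u64 {
	uint32_t lo;
	uint32_t hi;
};

static uint64_t join_words(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Replay one interleaving: the reader loads the low word, the writer then
 * stores the complete new value, and the reader finally loads the high word.
 */
static uint64_t torn_read(struct split_u64 *v, uint64_t new_val)
{
	uint32_t lo = v->lo;			/* reader: first word access */

	v->lo = (uint32_t)new_val;		/* writer: updates both words */
	v->hi = (uint32_t)(new_val >> 32);

	uint32_t hi = v->hi;			/* reader: second word access */

	return join_words(hi, lo);
}
```

Starting from 0x00000000ffffffff and writing 0x0000000100000000, this interleaving makes the reader see 0x00000001ffffffff, which is neither the old nor the new value.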
Another consideration is that compat_auxtrace_mmap__read_head() is a generic function, so I would rather avoid writing it with any architecture-specific instructions.
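As an illustration of what an architecture-neutral read could look like, the sketch below reads the high word, then the low word, then the high word again, and retries if the high word changed in between. It assumes the value only ever increases and that the low word cannot wrap completely between the two high-word reads. struct split_head and read_head_generic() are hypothetical names, and this is not necessarily how compat_auxtrace_mmap__read_head() is implemented.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Hypothetical layout: the 64-bit head exposed as two 32-bit atomic words. */
struct split_head {
	_Atomic uint32_t lo;
	_Atomic uint32_t hi;
};

/*
 * Read a 64-bit value using only 32-bit single-copy atomic loads.
 * If the high word differs before and after the low-word load, a writer
 * raced with us, so the attempt is discarded and repeated.
 */
static uint64_t read_head_generic(struct split_head *h)
{
	uint32_t hi_first, lo, hi_last;

	do {
		hi_first = atomic_load_explicit(&h->hi, memory_order_acquire);
		lo = atomic_load_explicit(&h->lo, memory_order_acquire);
		hi_last = atomic_load_explicit(&h->hi, memory_order_acquire);
	} while (hi_first != hi_last);

	return ((uint64_t)hi_last << 32) | lo;
}
```

The loop relies only on C11 atomics, so nothing in it is specific to Arm; the cost is two extra word loads per read and a retry in the rare case that the high word changes.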
> Is this a problem noticed on non-ARM architectures?
No. We have not observed it in practice; we only identified it as a potential issue from analysing the weak memory model.
Thanks,
Leo