On Thu, Apr 29, 2021 at 7:37 PM Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 29/04/2021 15:57, Arnd Bergmann wrote:
From: Arnd Bergmann arnd@arndb.de
clang-12 fails to build the etm4x driver with -fsanitize=array-bounds, where it decides to unroll certain loops in a way that result in a C variable getting put into an inline assembly
<instantiation>:1:7: error: expected constant expression in '.inst' directive .inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8)) ^ drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in macro instantiation etm4x_relaxed_read32(csa, TRCCNTVRn(i)); ^ drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from macro 'etm4x_relaxed_read32' read_etm4x_sysreg_offset((offset), false))) ^ drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded from macro 'read_etm4x_sysreg_offset' __val = read_etm4x_sysreg_const_offset((offset)); \ ^ drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from macro 'read_etm4x_sysreg_const_offset' READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset)) ^ drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from macro 'READ_ETM4x_REG' read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg))) ^ arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro 'read_sysreg_s' asm volatile(__mrs_s("%0", r) : "=r" (__val)); \ ^ arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s' " mrs_s " v ", " __stringify(r) "\n" \ ^
This only happened in a few loops in which the array bounds sanitizer added a special case for an array overflow that clang determined to be possible, but any compiler is free to unroll any of the loops in the same way that breaks the sysreg macros.
Introduce helper functions that perform a sysreg access with a non-constant register number and use them in each call that passes a loop counter.
You don't need to add this special helper. We have the exact infrastructure already. So these could simply be replaced with:
csdev_access_xxx(csa, ...)
see :
include/linux/coresight.h
Ah, nice!
Do you mean replacing only the ones that use a nonconstant offset, or all of them? I guess changing all would avoid some really ugly magic macros, but the indirect function call and the switch() adds a few cycles of overhead every time and the code looks like it is micro-optimized for fast register access here.
Arnd