On Mon, 1 Sep 2025, Peter Zijlstra wrote:
On Mon, Sep 01, 2025 at 11:36:00AM +0200, Peter Zijlstra wrote:
Something like the completely untested below should do I suppose.
diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h index 711a1f0d1a73..e39cdfe5a59e 100644 --- a/include/linux/instrumented.h +++ b/include/linux/instrumented.h @@ -67,6 +67,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_ { kasan_check_read(v, size); kcsan_check_atomic_read(v, size);
- WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
} /** @@ -81,6 +82,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size { kasan_check_write(v, size); kcsan_check_atomic_write(v, size);
- WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
} /** @@ -95,6 +97,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v, { kasan_check_write(v, size); kcsan_check_atomic_read_write(v, size);
- WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
} /**
Arguably, that should've been something like:
((unsigned long)v & (size-1))
I suppose.
I've been testing this patch and these new WARN_ON splats have revealed some issues.
To start with, I overlooked atomic64_t in my patch. But aligning that isn't sufficient in some cases: we have kmem caches of structs containing atomic64_t members, where kmem_cache_create() is used with a default alignment of 4, for example, the struct inode cache in fs/proc/inode.c. So I changed the above CONFIG_DEBUG_ATOMIC test to (unsigned long)v & (sizeof(long) - 1).
Another issue I encountered was atomic operations used on non-atomic types. The try_cmpxchg() in task_work_add() triggers the CONFIG_DEBUG_ATOMIC misalignment WARN because of the 850 byte offset of task_works into struct task_struct. I got around this by adding __aligned(sizeof(long)) to task_works, but maybe __aligned(sizeof(atomic_t)) would be better (?)
Another example of this problem (i.e. atomic operation used with non-atomic type) appears in wait_task_zombie() in kernel/exit.c, where cmpxchg() is used on exit_state, found at offset 418 in struct task_struct. I prevented this by adding __aligned(sizeof(long)) to exit_state. I'm not sure what the right patch is.
A different problem shows up in spi_setup_transport_attrs() where spi_dv_mutex() is used to coerce starget_data into struct spi_transport_attrs, which happens to contain some atomic storage. starget_data is found at the end of the dynamically allocated struct scsi_target (not an uncommon pattern). So starget_data ends up at offset 290, and struct spi_transport_attrs is also misaligned, as is dv_mutex. To get around this, I changed the definition of struct scsi_target:
--- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -370,7 +370,7 @@ struct scsi_target { char scsi_level; enum scsi_target_state state; void *hostdata; /* available to low-level driver */ - unsigned long starget_data[]; /* for the transport */ + atomic64_t starget_data[]; /* for the transport */ /* starget_data must be the last element!!!! */ } __attribute__((aligned(sizeof(unsigned long))));
This patch works because starget_data is never accessed except where it is being cast as some other type. But there's probably a reason for the 'long' here that I don't know about... maybe I should use atomic_long_t or atomic_t.
I found a few structures in the VFS layer with a misaligned lock, which I patched my way around. There's still a WARN splat from down_write() for semaphores somewhere in the VFS and TTY layers, but I was unable to track down the relevant memory allocations:
[ 59.500000] ------------[ cut here ]------------ [ 59.500000] WARNING: CPU: 0 PID: 329 at ./include/linux/instrumented.h:101 rwsem_down_write_slowpath+0x26a/0x4ca [ 59.500000] Modules linked in: [ 59.500000] CPU: 0 UID: 0 PID: 329 Comm: (udev-worker) Not tainted 6.16.0-multi-00009-g2b4218de16c4 #1 NONE [ 59.500000] Stack from 01e4be64: [ 59.500000] 01e4be64 005b83e3 005b83e3 005aa650 00000065 00000009 01e4be88 00007398 [ 59.500000] 005b83e3 01e4be98 000353a0 005aa41f 01e3024c 01e4bed0 00002594 005aa650 [ 59.500000] 00000065 00507f2a 00000009 00000000 00000000 00000000 00000002 00000002 [ 59.500000] 00000000 00000000 01e3024c 01e4bf40 00507f2a 005aa650 00000065 00000009 [ 59.500000] 00000000 00000000 00000000 00141ef8 0013e360 fffffffe 0aba9500 01e3024c [ 59.500000] 01e4bfa0 00f981b0 00000001 01e4bf2a 01e30254 01e4bf2a 00070000 00020000 [ 59.500000] Call Trace: [<00007398>] dump_stack+0x10/0x16 [ 59.500000] [<000353a0>] __warn+0xd0/0xf8 [ 59.500000] [<00002594>] warn_slowpath_fmt+0x54/0x62 [ 59.500000] [<00507f2a>] rwsem_down_write_slowpath+0x26a/0x4ca [ 59.500000] [<00507f2a>] rwsem_down_write_slowpath+0x26a/0x4ca [ 59.500000] [<00141ef8>] iput+0x0/0x1fe [ 59.500000] [<0013e360>] dput+0x0/0x160 [ 59.500000] [<00070000>] defer_console_output+0x28/0x34 [ 59.500000] [<00020000>] _FP_CALL_TOP+0x25c2/0xd512 [ 59.500000] [<000101e4>] ikbd_mouse_y0_bot+0x4/0x1c [ 59.500000] [<0000ffff>] atari_keyb_init+0xf7/0x17a [ 59.500000] [<005081c2>] down_write+0x38/0xf6 [ 59.500000] [<0013733a>] do_unlinkat+0xc8/0x264 [ 59.500000] [<0013755e>] sys_unlink+0x1c/0x20 [ 59.500000] [<0000876a>] syscall+0x8/0xc [ 59.500000] [<0000c048>] die_if_kernel.part.0+0x2c/0x40 [ 59.500000] [ 59.500000] ---[ end trace 0000000000000000 ]---
All of my testing was done on m68k. It would be interesting to try this on some other 32-bit architecture that tolerates misaligned accesses. More misaligned 64-bit atomics would be unsurprising.
The patches for these experiments may be found at, https://github.com/fthain/linux/tree/atomic_t