Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Eero Tamminen <oak@helsinkinet.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Reported-by: Eero Tamminen <oak@helsinkinet.fi>
Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc...
Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
---
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
---
 include/linux/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..cd5b2b0f4b02 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t;
 typedef unsigned long irq_hw_number_t;
 
 typedef struct {
-	int counter;
+	int counter __aligned(sizeof(int));
 } atomic_t;
 
 #define ATOMIC_INIT(i)	{ (i) }
Hi Finn,
Nice work, thanks for your patch!
On 2025/8/25 10:03, Finn Thain wrote:
Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Eero Tamminen <oak@helsinkinet.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Reported-by: Eero Tamminen <oak@helsinkinet.fi>
Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc...
Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
Yeah, it is the correct approach for the spurious warnings on architectures like m68k, where the natural alignment of types can be less than 4 bytes.
 include/linux/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..cd5b2b0f4b02 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t;
 typedef unsigned long irq_hw_number_t;
 
 typedef struct {
-	int counter;
+	int counter __aligned(sizeof(int));
 } atomic_t;
 
 #define ATOMIC_INIT(i)	{ (i) }
However, as we've seen from the kernel test robot's report on mt6660_chip, this won't solve the cases where a lock is forced to be unaligned by #pragma pack(1). That will still trigger warnings, IIUC.
Perhaps we should also apply the following?
diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
index 34e615c76ca5..940f8f3558f6 100644
--- a/include/linux/hung_task.h
+++ b/include/linux/hung_task.h
@@ -45,7 +45,7 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
 	 * If the lock pointer matches the BLOCKER_TYPE_MASK, return
 	 * without writing anything.
 	 */
-	if (WARN_ON_ONCE(lock_ptr & BLOCKER_TYPE_MASK))
+	if (lock_ptr & BLOCKER_TYPE_MASK)
 		return;
 
 	WRITE_ONCE(current->blocker, lock_ptr | type);
@@ -53,8 +53,6 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
 
 static inline void hung_task_clear_blocker(void)
 {
-	WARN_ON_ONCE(!READ_ONCE(current->blocker));
-
 	WRITE_ONCE(current->blocker, 0UL);
 }
Let the feature gracefully do nothing on that ;)
On Mon, 25 Aug 2025, Lance Yang wrote:
However, as we've seen from the kernel test robot's report on mt6660_chip, this won't solve the cases where a lock is forced to be unaligned by #pragma pack(1). That will still trigger warnings, IIUC.
I think you've misunderstood the warning that your patch produced. (BTW, I have not seen any warnings from my own patch, so far.)
The mistake you made in your patch was to add an alignment attribute to a member of a packed struct. That's why I suggested that you should align the lock instead.
Is there some problem with my approach?
On 2025/8/25 11:59, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
However, as we've seen from the kernel test robot's report on mt6660_chip, this won't solve the cases where a lock is forced to be unaligned by #pragma pack(1). That will still trigger warnings, IIUC.
I think you've misunderstood the warning that your patch produced. (BTW, I have not seen any warnings from my own patch, so far.)
The mistake you made in your patch was to add an alignment attribute to a member of a packed struct. That's why I suggested that you should align the lock instead.
Apologies for the confusion. I was referring to the runtime warning from WARN_ON_ONCE, not a compile-time warning.
On Mon, 25 Aug 2025, Lance Yang wrote:
Perhaps we should also apply the following?
diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
index 34e615c76ca5..940f8f3558f6 100644
--- a/include/linux/hung_task.h
+++ b/include/linux/hung_task.h
@@ -45,7 +45,7 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
 	 * If the lock pointer matches the BLOCKER_TYPE_MASK, return
 	 * without writing anything.
 	 */
-	if (WARN_ON_ONCE(lock_ptr & BLOCKER_TYPE_MASK))
+	if (lock_ptr & BLOCKER_TYPE_MASK)
 		return;
 
 	WRITE_ONCE(current->blocker, lock_ptr | type);
@@ -53,8 +53,6 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
 
 static inline void hung_task_clear_blocker(void)
 {
-	WARN_ON_ONCE(!READ_ONCE(current->blocker));
-
 	WRITE_ONCE(current->blocker, 0UL);
 }
Let the feature gracefully do nothing on that ;)
This is poor style indeed.
The conditional you've added to the hung task code has no real relevance to hung tasks. It doesn't belong there.
Of course, nobody wants that sort of logic to get duplicated at each site affected by the architectural quirk in question. Try to imagine if the whole kernel followed your example, and such unrelated conditionals were scattered across the code base for a few decades. Now imagine trying to work on that code.
You can see special cases for architectural quirks in drivers, but we do try to avoid them. And this is not a driver.
On 2025/8/25 12:07, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
Perhaps we should also apply the following?
diff --git a/include/linux/hung_task.h b/include/linux/hung_task.h
index 34e615c76ca5..940f8f3558f6 100644
--- a/include/linux/hung_task.h
+++ b/include/linux/hung_task.h
@@ -45,7 +45,7 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
 	 * If the lock pointer matches the BLOCKER_TYPE_MASK, return
 	 * without writing anything.
 	 */
-	if (WARN_ON_ONCE(lock_ptr & BLOCKER_TYPE_MASK))
+	if (lock_ptr & BLOCKER_TYPE_MASK)
 		return;
 
 	WRITE_ONCE(current->blocker, lock_ptr | type);
@@ -53,8 +53,6 @@ static inline void hung_task_set_blocker(void *lock, unsigned long type)
 
 static inline void hung_task_clear_blocker(void)
 {
-	WARN_ON_ONCE(!READ_ONCE(current->blocker));
-
 	WRITE_ONCE(current->blocker, 0UL);
 }
Let the feature gracefully do nothing on that ;)
This is poor style indeed.
Thanks for the lesson!
The conditional you've added to the hung task code has no real relevance to hung tasks. It doesn't belong there.
You're right! The original pointer-encoding was a deliberate trade-off to save a field in task_struct, but as we're seeing now, that assumption is fragile and causing issues :(
Of course, nobody wants that sort of logic to get duplicated at each site affected by the architectural quirk in question. Try to imagine if the whole kernel followed your example, and such unrelated conditionals were scattered across code base for a few decades. Now imagine trying to work on that code.
I agree with you completely: scattering more alignment checks into core logic isn't the right long-term solution. It's not a clean design :(
You can see special cases for architectural quirks in drivers, but we do try to avoid them. And this is not a driver.
So, how about this?
What if we squash the runtime check fix into your patch? That would create a single, complete fix that can be cleanly backported to stop all the spurious warnings at once.
Then, as a follow-up, we can work on the proper long-term solution: changing the pointer-encoding and re-introducing a dedicated field for the blocker type.
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
```
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
```
Result from dmesg:

[Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1
As we can see, a packed struct can still force the entire mutex object to an unaligned address. With an address like this, the WARN_ON_ONCE can still be triggered.
That's why I proposed squashing the runtime check fix into your patch. Then it can be cleanly backported to stop all the spurious warnings at once.
I hope this clarifies things.
[Belated Cc linux-m68k...]
On Mon, 25 Aug 2025, Lance Yang wrote:
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");

Result from dmesg:

[Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1
Thanks for sending code to illustrate your point. Unfortunately, I was not able to reproduce your results. Tested on x86, your test module shows no misalignment:
[131840.349042] io_lock address offset mod 4: 0
Tested on m68k I also get 0, given the patch at the top of this thread:
[ 0.400000] io_lock address offset mod 4: 0
As we can see, a packed struct can still force the entire mutex object to an unaligned address. With an address like this, the WARN_ON_ONCE can still be triggered.
I don't think so. But there is something unexpected going on here -- the output from pahole appears to say io_lock is at offset 49, which seems to contradict the printk() output above.
struct test_container {
	char padding[49];                                     /*  0 49 */
	struct mutex io_lock __attribute__((__aligned__(1))); /* 49 12 */

	/* size: 61, cachelines: 1, members: 2 */
	/* forced alignments: 1 */
	/* last cacheline: 61 bytes */
} __attribute__((__packed__));
Thanks for digging deeper!
On 2025/8/25 18:49, Finn Thain wrote:
[Belated Cc linux-m68k...]
On Mon, 25 Aug 2025, Lance Yang wrote:
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");

Result from dmesg:

[Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1
Thanks for sending code to illustrate your point. Unfortunately, I was not able to reproduce your results. Tested on x86, your test module shows no misalignment:
[131840.349042] io_lock address offset mod 4: 0
Tested on m68k I also get 0, given the patch at the top of this thread:
[ 0.400000] io_lock address offset mod 4: 0
As we can see, a packed struct can still force the entire mutex object to an unaligned address. With an address like this, the WARN_ON_ONCE can still be triggered.
I don't think so. But there is something unexpected going on here -- the output from pahole appears to say io_lock is at offset 49, which seems to contradict the printk() output above.
Interesting! That contradiction is the key. It seems we're seeing different compiler behaviors.
I'm on GCC 14.2.0 (Debian 14.2.0-19), and it appears to be strictly respecting the packed attribute.
So let's print something more:
```
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
```
Result from dmesg:
```
[Mon Aug 25 19:15:33 2025] Container base address      : ff1100063570f840
[Mon Aug 25 19:15:33 2025] io_lock member address      : ff1100063570f871
[Mon Aug 25 19:15:33 2025] io_lock address offset mod 4: 1
```
io_lock is exactly base + 49, resulting in misalignment.
Seems like your compiler is cleverly re-aligning the whole struct on the stack, but we can't rely on that behavior, as it's not guaranteed across all compilers or versions. wdyt?
On 2025/8/25 19:19, Lance Yang wrote:
Thanks for digging deeper!
On 2025/8/25 18:49, Finn Thain wrote:
[Belated Cc linux-m68k...]
On Mon, 25 Aug 2025, Lance Yang wrote:
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");

Result from dmesg:

[Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1
Thanks for sending code to illustrate your point. Unfortunately, I was not able to reproduce your results. Tested on x86, your test module shows no misalignment:
[131840.349042] io_lock address offset mod 4: 0
Tested on m68k I also get 0, given the patch at the top of this thread:
[ 0.400000] io_lock address offset mod 4: 0
As we can see, a packed struct can still force the entire mutex object to an unaligned address. With an address like this, the WARN_ON_ONCE can still be triggered.
I don't think so. But there is something unexpected going on here -- the output from pahole appears to say io_lock is at offset 49, which seems to contradict the printk() output above.
Interesting! That contradiction is the key. It seems we're seeing different compiler behaviors.
I'm on GCC 14.2.0 (Debian 14.2.0-19), and it appears to be strictly respecting the packed attribute.
So let's print something more:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:
[Mon Aug 25 19:15:33 2025] Container base address      : ff1100063570f840
[Mon Aug 25 19:15:33 2025] io_lock member address      : ff1100063570f871
[Mon Aug 25 19:15:33 2025] io_lock address offset mod 4: 1
io_lock is exactly base + 49, resulting in misalignment.
Seems like your compiler is cleverly re-aligning the whole struct on the stack, but we can't rely on that behavior, as it's not guaranteed across all compilers or versions. wdyt?
Same here, using a global static variable instead of a local one. The result is consistently misaligned.
```
#include <linux/module.h>
#include <linux/init.h>

static struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
} cont;

static int __init alignment_init(void)
{
	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
```
Result from dmesg:
```
[Mon Aug 25 19:33:28 2025] Container base address      : ffffffffc28f0940
[Mon Aug 25 19:33:28 2025] io_lock member address      : ffffffffc28f0971
[Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1
```
On Mon, 25 Aug 2025, Lance Yang wrote:
Same here, using a global static variable instead of a local one. The result is consistently misaligned.
#include <linux/module.h>
#include <linux/init.h>

static struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
} cont;

static int __init alignment_init(void)
{
	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:
[Mon Aug 25 19:33:28 2025] Container base address      : ffffffffc28f0940
[Mon Aug 25 19:33:28 2025] io_lock member address      : ffffffffc28f0971
[Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1
FTR, I was able to reproduce that result (i.e. static storage):
[    0.320000] Container base address      : 0055d9d0
[    0.320000] io_lock member address      : 0055da01
[    0.320000] io_lock address offset mod 4: 1
I think the experiments you sent previously would have demonstrated the same result, except for the unpredictable base address that you sensibly logged in this version.
On 2025/8/28 07:43, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
Same here, using a global static variable instead of a local one. The result is consistently misaligned.
#include <linux/module.h>
#include <linux/init.h>

static struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
} cont;

static int __init alignment_init(void)
{
	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:
[Mon Aug 25 19:33:28 2025] Container base address      : ffffffffc28f0940
[Mon Aug 25 19:33:28 2025] io_lock member address      : ffffffffc28f0971
[Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1
FTR, I was able to reproduce that result (i.e. static storage):
[    0.320000] Container base address      : 0055d9d0
[    0.320000] io_lock member address      : 0055da01
[    0.320000] io_lock address offset mod 4: 1
I think the experiments you sent previously would have demonstrated the same result, except for the unpredictable base address that you sensibly logged in this version.
Thanks for taking the time to reproduce it!
This proves the problem can happen in practice (e.g., with packed structs), so we need to ignore the unaligned pointers on the architectures that don't trap for now.
Cheers, Lance
Hi Lance,
On Thu, 28 Aug 2025 at 04:05, Lance Yang lance.yang@linux.dev wrote:
On 2025/8/28 07:43, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
Same here, using a global static variable instead of a local one. The result is consistently misaligned.
#include <linux/module.h>
#include <linux/init.h>

static struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
} cont;

static int __init alignment_init(void)
{
	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:
[Mon Aug 25 19:33:28 2025] Container base address      : ffffffffc28f0940
[Mon Aug 25 19:33:28 2025] io_lock member address      : ffffffffc28f0971
[Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1
FTR, I was able to reproduce that result (i.e. static storage):
[    0.320000] Container base address      : 0055d9d0
[    0.320000] io_lock member address      : 0055da01
[    0.320000] io_lock address offset mod 4: 1
I think the experiments you sent previously would have demonstrated the same result, except for the unpredictable base address that you sensibly logged in this version.
Thanks for taking the time to reproduce it!
This proves the problem can happen in practice (e.g., with packed structs), so we need to ignore the unaligned pointers on the architectures that don't trap for now.
Putting locks inside a packed struct is definitely a Very Bad Idea and a No Go. Packed structs are meant to describe memory data and MMIO register layouts, and must not contain control data for critical sections.
Gr{oetje,eeting}s,
Geert
Hi Geert,
On 2025/9/1 16:45, Geert Uytterhoeven wrote:
Hi Lance,
On Thu, 28 Aug 2025 at 04:05, Lance Yang lance.yang@linux.dev wrote:
On 2025/8/28 07:43, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
Same here, using a global static variable instead of a local one. The result is consistently misaligned.
#include <linux/module.h>
#include <linux/init.h>

static struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
} cont;

static int __init alignment_init(void)
{
	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:
[Mon Aug 25 19:33:28 2025] Container base address      : ffffffffc28f0940
[Mon Aug 25 19:33:28 2025] io_lock member address      : ffffffffc28f0971
[Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1
FTR, I was able to reproduce that result (i.e. static storage):
[    0.320000] Container base address      : 0055d9d0
[    0.320000] io_lock member address      : 0055da01
[    0.320000] io_lock address offset mod 4: 1
I think the experiments you sent previously would have demonstrated the same result, except for the unpredictable base address that you sensibly logged in this version.
Thanks for taking the time to reproduce it!
This proves the problem can happen in practice (e.g., with packed structs), so we need to ignore the unaligned pointers on the architectures that don't trap for now.
Putting locks inside a packed struct is definitely a Very Bad Idea and a No Go. Packed structs are meant to describe memory data and
Right. That's definitely not how packed structs should be used ;)
MMIO register layouts, and must not contain control data for critical sections.
Unfortunately, this pattern was found in an in-tree driver, as reported[1] by kernel test robot, and there might be other undiscovered instances ...
[1] https://lore.kernel.org/oe-kbuild-all/202508240539.ARmC1Umu-lkp@intel.com
Cheers, Lance
Hi Lance,
On Tue, 2 Sept 2025 at 15:31, Lance Yang lance.yang@linux.dev wrote:
On 2025/9/1 16:45, Geert Uytterhoeven wrote:
On Thu, 28 Aug 2025 at 04:05, Lance Yang lance.yang@linux.dev wrote:
This proves the problem can happen in practice (e.g., with packed structs), so we need to ignore the unaligned pointers on the architectures that don't trap for now.
Putting locks inside a packed struct is definitely a Very Bad Idea and a No Go. Packed structs are meant to describe memory data and
Right. That's definitely not how packed structs should be used ;)
MMIO register layouts, and must not contain control data for critical sections.
Unfortunately, this patten was found in an in-tree driver, as reported[1] by kernel test robot, and there might be other undiscovered instances ...
[1] https://lore.kernel.org/oe-kbuild-all/202508240539.ARmC1Umu-lkp@intel.com
That one is completely bogus, and should be removed. Currently it would crash on any platform that does not support unaligned accesses.
Gr{oetje,eeting}s,
Geert
On Mon, 1 Sep 2025 10:45:46 +0200 Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Lance,
On Thu, 28 Aug 2025 at 04:05, Lance Yang lance.yang@linux.dev wrote:
On 2025/8/28 07:43, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
Same here, using a global static variable instead of a local one. The result is consistently misaligned.
#include <linux/module.h>
#include <linux/init.h>

static struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
} cont;

static int __init alignment_init(void)
{
	pr_info("Container base address      : %px\n", &cont);
	pr_info("io_lock member address      : %px\n", &cont.io_lock);
	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
	return 0;
}

static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:
[Mon Aug 25 19:33:28 2025] Container base address      : ffffffffc28f0940
[Mon Aug 25 19:33:28 2025] io_lock member address      : ffffffffc28f0971
[Mon Aug 25 19:33:28 2025] io_lock address offset mod 4: 1
FTR, I was able to reproduce that result (i.e. static storage):
[    0.320000] Container base address      : 0055d9d0
[    0.320000] io_lock member address      : 0055da01
[    0.320000] io_lock address offset mod 4: 1
I think the experiments you sent previously would have demonstrated the same result, except for the unpredictable base address that you sensibly logged in this version.
Thanks for taking the time to reproduce it!
This proves the problem can happen in practice (e.g., with packed structs), so we need to ignore the unaligned pointers on the architectures that don't trap for now.
Putting locks inside a packed struct is definitely a Very Bad Idea and a No Go. Packed structs are meant to describe memory data and MMIO register layouts, and must not contain control data for critical sections.
Even for MMIO register layouts you don't (usually) want 'packed'. You may need to add explicit padding, and an 'error if padded' attribute would be useful. Sometimes you have (e.g.) a 64bit item on a 32bit boundary; marking that member 'packed' will remove the gap before it - usually what is wanted.
In reality pretty much nothing should be 'packed'.
David.
Gr{oetje,eeting}s,
Geert
On Mon, 25 Aug 2025 15:46:42 +0800 Lance Yang lance.yang@linux.dev wrote:
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
Doesn't that give a compilation warning from 'taking the address of a packed member'? Ignore that at your peril.
More problematic is that, IIRC, m68k kmalloc() allocates 16bit aligned memory. This has broken other things in the past. I doubt that increasing the alignment to 32bits would make much difference to the kernel memory footprint.
David
return 0;
}
static void __exit alignment_exit(void)
{
	pr_info("Module unloaded\n");
}

module_init(alignment_init);
module_exit(alignment_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("x");
MODULE_DESCRIPTION("x");
Result from dmesg:

[Mon Aug 25 15:44:50 2025] io_lock address offset mod 4: 1

As we can see, a packed struct can still force the entire mutex object to an unaligned address. With an address like this, the WARN_ON_ONCE can still be triggered.

That's why I proposed squashing the runtime check fix into your patch. Then it can be cleanly backported to stop all the spurious warnings at once.

I hope this clarifies things.
On 2025/8/25 20:07, David Laight wrote:
On Mon, 25 Aug 2025 15:46:42 +0800 Lance Yang lance.yang@linux.dev wrote:
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
Doesn't that give a compilation warning from 'taking the address of a packed member'? Ignore that at your peril.
Hmm, I don't see that actually ...
More problematic is that, IIRC, m68k kmalloc() allocates 16bit aligned memory. This has broken other things in the past. I doubt that increasing the alignment to 32bits would make much difference to the kernel memory footprint.
@Finn Given this new information, how about we just apply the runtime check fix for now?
Since we plan to remove the entire pointer-encoding scheme later anyway, a minimal and targeted change could be the logical choice. It's easy and safe to backport, and it cleanly stops the warnings from all sources without introducing new risks - exactly what we need for stable kernels.
Cheers, Lance
On Mon, 25 Aug 2025, Lance Yang wrote:
More problematic is that, IIRC, m68k kmalloc() allocates 16bit aligned memory. This has broken other things in the past. I doubt that increasing the alignment to 32bits would make much difference to the kernel memory footprint.
@Finn Given this new information, how about we just apply the runtime check fix for now?
New information? No, that's just hear-say.
Since we plan to remove the entire pointer-encoding scheme later anyway, a minimal and targeted change could be the logical choice. It's easy and safe to backport, and it cleanly stops the warnings from all sources without introducing new risks - exactly what we need for stable kernels.
Well, that's up to you, of course. If you want my comment, I'd only ask whether or not the bug is theoretical (outside of m68k).
On 2025/8/27 16:00, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
More problematic is that, IIRC, m68k kmalloc() allocates 16bit aligned memory. This has broken other things in the past. I doubt that increasing the alignment to 32bits would make much difference to the kernel memory footprint.
@Finn Given this new information, how about we just apply the runtime check fix for now?
New information? No, that's just hear-say.
Emm... I jumped the gun there ;p
Since we plan to remove the entire pointer-encoding scheme later anyway, a minimal and targeted change could be the logical choice. It's easy and safe to backport, and it cleanly stops the warnings from all sources without introducing new risks - exactly what we need for stable kernels.
Well, that's up to you, of course. If you want my comment, I'd only ask whether or not the bug is theoretical (outside of m68k).
Well, let's apply both this fix and the runtime check fix[1] as Masami suggested ;)
[1] https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev/
Hi David,
On Mon, 25 Aug 2025 at 14:07, David Laight david.laight.linux@gmail.com wrote:
On Mon, 25 Aug 2025 15:46:42 +0800 Lance Yang lance.yang@linux.dev wrote:
On 2025/8/25 14:17, Finn Thain wrote:
On Mon, 25 Aug 2025, Lance Yang wrote:
What if we squash the runtime check fix into your patch?
Did my patch not solve the problem?
Hmm... it should solve the problem for natural alignment, which is a critical fix.
But it cannot solve the problem of forced misalignment from drivers using #pragma pack(1). The runtime warning will still trigger in those cases.
I built a simple test module on a kernel with your patch applied:
#include <linux/module.h>
#include <linux/init.h>

struct __attribute__((packed)) test_container {
	char padding[49];
	struct mutex io_lock;
};

static int __init alignment_init(void)
{
	struct test_container cont;

	pr_info("io_lock address offset mod 4: %lu\n",
		(unsigned long)&cont.io_lock % 4);
Doesn't that give a compilation warning from 'taking the address of a packed member'? Ignore that at your peril.
More problematic is that, IIRC, m68k kmalloc() allocates 16bit aligned memory. This has broken other things in the past.
Really? AFAIK it always returns memory that is at least aligned to the cache line size (i.e. 16 bytes on m68k). So perhaps you are confusing "16bit" with "16byte"?
Gr{oetje,eeting}s,
Geert
On Mon, Aug 25, 2025 at 12:03:05PM +1000, Finn Thain wrote:
Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
Cc: Andrew Morton akpm@linux-foundation.org Cc: Geert Uytterhoeven geert@linux-m68k.org Cc: Lance Yang lance.yang@linux.dev Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Eero Tamminen oak@helsinkinet.fi Cc: Peter Zijlstra peterz@infradead.org Cc: Will Deacon will@kernel.org Cc: stable@vger.kernel.org Reported-by: Eero Tamminen oak@helsinkinet.fi Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc... Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
I don't see how this patch is at all relevant. Let alone how it's fixed by the below.
Signed-off-by: Finn Thain fthain@linux-m68k.org
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
include/linux/types.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..cd5b2b0f4b02 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t;
 typedef unsigned long irq_hw_number_t;
 
 typedef struct {
-	int counter;
+	int counter __aligned(sizeof(int));
 } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
And your architecture doesn't trap on unaligned atomic access ?!!?!
On Mon, Aug 25, 2025 at 06:03:23PM +1000, Finn Thain wrote:
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
And your architecture doesn't trap on unaligned atomic access ?!!?!
Right. This port doesn't do SMP.
There is RMW_INSN which seems to imply a compare-and-swap instruction of sorts. That is happy to work on unaligned storage?
Anyway, it might make sense to add an alignment check to arch/m68k/include/asm/atomic.h somewhere, perhaps dependent on some DEBUG flag or other.
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
On Mon, Aug 25, 2025 at 06:03:23PM +1000, Finn Thain wrote:
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
And your architecture doesn't trap on unaligned atomic access ?!!?!
Right. This port doesn't do SMP.
There is RMW_INSN which seems to imply a compare-and-swap instruction of sorts. That is happy to work on unaligned storage?
Yes, the TAS and CAS instructions are happy to work on unaligned storage.
However, these operations involve an indivisible bus cycle that hogs the bus to the detriment of other processors, DMA controllers etc. So I suspect lock alignment would tend to shorten read-modify-write cycles, and improve efficiency, when CONFIG_RMW_INSN is enabled.
Most m68k platforms will have CONFIG_RMW_INSN disabled, or else simply don't implement TAS and CAS. In this case, lock alignment might still help, just because L1 cache entries are long words. I've not tried to measure this.
On Wed, Aug 27, 2025 at 05:17:19PM +1000, Finn Thain wrote:
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
On Mon, Aug 25, 2025 at 06:03:23PM +1000, Finn Thain wrote:
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
And your architecture doesn't trap on unaligned atomic access ?!!?!
Right. This port doesn't do SMP.
There is RMW_INSN which seems to imply a compare-and-swap instruction of sorts. That is happy to work on unaligned storage?
Yes, the TAS and CAS instructions are happy to work on unaligned storage.
However, these operations involve an indivisible bus cycle that hogs the bus to the detriment of other processors, DMA controllers etc. So I suspect lock alignment would tend to shorten read-modify-write cycles, and improve efficiency, when CONFIG_RMW_INSN is enabled.
Most m68k platforms will have CONFIG_RMW_INSN disabled, or else simply don't implement TAS and CAS. In this case, lock alignment might still help, just because L1 cache entries are long words. I've not tried to measure this.
Fair enough; this sounds a little like the x86 LOCK prefix, it will work on unaligned memory, but at tremendous cost (recent chips have an optional exception on unaligned).
Anyway, I'm not opposed to adding an explicit alignment to atomic_t. Isn't s32 or __s32 already having this?
But I think it might make sense to have a DEBUG alignment check right along with adding that alignment, just to make sure things are indeed / stay aligned.
On Wed, 27 Aug 2025, Peter Zijlstra wrote:
On Wed, Aug 27, 2025 at 05:17:19PM +1000, Finn Thain wrote:
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
On Mon, Aug 25, 2025 at 06:03:23PM +1000, Finn Thain wrote:
On Mon, 25 Aug 2025, Peter Zijlstra wrote:
And your architecture doesn't trap on unaligned atomic access ?!!?!
Right. This port doesn't do SMP.
There is RMW_INSN which seems to imply a compare-and-swap instruction of sorts. That is happy to work on unaligned storage?
Yes, the TAS and CAS instructions are happy to work on unaligned storage.
However, these operations involve an indivisible bus cycle that hogs the bus to the detriment of other processors, DMA controllers etc. So I suspect lock alignment would tend to shorten read-modify-write cycles, and improve efficiency, when CONFIG_RMW_INSN is enabled.
Most m68k platforms will have CONFIG_RMW_INSN disabled, or else simply don't implement TAS and CAS. In this case, lock alignment might still help, just because L1 cache entries are long words. I've not tried to measure this.
Fair enough; this sounds a little like the x86 LOCK prefix, it will work on unaligned memory, but at tremendous cost (recent chips have an optional exception on unaligned).
Anyway, I'm not opposed to adding an explicit alignment to atomic_t. Isn't s32 or __s32 already having this?
For Linux/m68k, __alignof__(__s32) == 2 and __alignof__(s32) == 2.
But I think it might make sense to have a DEBUG alignment check right along with adding that alignment, just to make sure things are indeed / stay aligned.
A run-time assertion seems superfluous as long as other architectures already trap for misaligned locks. For m68k, perhaps we could have a compile-time check:
--- a/arch/m68k/kernel/setup_mm.c
+++ b/arch/m68k/kernel/setup_mm.c
@@ -371,6 +371,12 @@ void __init setup_arch(char **cmdline_p)
 	}
 #endif
 #endif
+
+	/*
+	 * 680x0 CPUs don't require aligned storage for atomic ops.
+	 * However, alignment assumptions may appear in core kernel code.
+	 */
+	BUILD_BUG_ON(__alignof__(atomic_t) < sizeof(atomic_t));
 }
But I'm not sure that arch/m68k is a good place for that kind of thing -- my inclination would be to place such compile-time assertions closer to the code that rests on that assertion, like in hung_task.c or mutex.c. E.g.
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -54,6 +54,8 @@ __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
 #endif
 
 	debug_mutex_init(lock, name, key);
+
+	BUILD_BUG_ON(__alignof__(lock->owner) < sizeof(lock->owner));
 }
 EXPORT_SYMBOL(__mutex_init);
Is that the kind of check you had in mind? I'm open to suggestions.
On Thu, Aug 28, 2025 at 07:53:52PM +1000, Finn Thain wrote:
Anyway, I'm not opposed to adding an explicit alignment to atomic_t. Isn't s32 or __s32 already having this?
For Linux/m68k, __alignof__(__s32) == 2 and __alignof__(s32) == 2.
Hmm, somehow I thought one of those enforced natural alignment. Oh well.
But I think it might make sense to have a DEBUG alignment check right along with adding that alignment, just to make sure things are indeed / stay aligned.
A run-time assertion seems surperfluous as long as other architectures already trap for misaligned locks.
Right, but those architectures have natural alignment. m68k is 'special' in that it doesn't have this.
For m68k, perhaps we could have a compile-time check:
I don't think build-time is sufficient. There is various code that casts to atomic types and other funny stuff like that.
If you want to ensure 'atomics' are always naturally aligned, the only sound way is to have a runtime test/trap.
Something like the completely untested below should do I suppose.
---
diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
index 711a1f0d1a73..e39cdfe5a59e 100644
--- a/include/linux/instrumented.h
+++ b/include/linux/instrumented.h
@@ -67,6 +67,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_
 {
 	kasan_check_read(v, size);
 	kcsan_check_atomic_read(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
@@ -81,6 +82,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
@@ -95,6 +97,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v,
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_read_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index dc0e0c6ed075..1c7e30cdfe04 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1363,6 +1363,16 @@ config DEBUG_PREEMPT
 	  depending on workload as it triggers debugging routines for each
 	  this_cpu operation. It should only be used for debugging purposes.
 
+config DEBUG_ATOMIC
+	bool "Debug atomic variables"
+	depends on DEBUG_KERNEL
+	help
+	  If you say Y here then the kernel will add a runtime alignment check
+	  to atomic accesses. Useful for architectures that do not have trap on
+	  mis-aligned access.
+
+	  This option has potentially significant overhead.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
 config LOCK_DEBUGGING_SUPPORT
On Mon, Sep 01, 2025 at 11:36:00AM +0200, Peter Zijlstra wrote:
Something like the completely untested below should do I suppose.
diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
index 711a1f0d1a73..e39cdfe5a59e 100644
--- a/include/linux/instrumented.h
+++ b/include/linux/instrumented.h
@@ -67,6 +67,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_
 {
 	kasan_check_read(v, size);
 	kcsan_check_atomic_read(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
@@ -81,6 +82,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
@@ -95,6 +97,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v,
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_read_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
Arguably, that should've been something like:
((unsigned long)v & (size-1))
I suppose.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index dc0e0c6ed075..1c7e30cdfe04 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1363,6 +1363,16 @@ config DEBUG_PREEMPT
 	  depending on workload as it triggers debugging routines for each
 	  this_cpu operation. It should only be used for debugging purposes.
 
+config DEBUG_ATOMIC
+	bool "Debug atomic variables"
+	depends on DEBUG_KERNEL
+	help
+	  If you say Y here then the kernel will add a runtime alignment check
+	  to atomic accesses. Useful for architectures that do not have trap on
+	  mis-aligned access.
+
+	  This option has potentially significant overhead.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
 config LOCK_DEBUGGING_SUPPORT
On Mon, 1 Sep 2025, Peter Zijlstra wrote:
On Mon, Sep 01, 2025 at 11:36:00AM +0200, Peter Zijlstra wrote:
Something like the completely untested below should do I suppose.
diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h
index 711a1f0d1a73..e39cdfe5a59e 100644
--- a/include/linux/instrumented.h
+++ b/include/linux/instrumented.h
@@ -67,6 +67,7 @@ static __always_inline void instrument_atomic_read(const volatile void *v, size_
 {
 	kasan_check_read(v, size);
 	kcsan_check_atomic_read(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
@@ -81,6 +82,7 @@ static __always_inline void instrument_atomic_write(const volatile void *v, size
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
@@ -95,6 +97,7 @@ static __always_inline void instrument_atomic_read_write(const volatile void *v,
 {
 	kasan_check_write(v, size);
 	kcsan_check_atomic_read_write(v, size);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ATOMIC) && ((unsigned long)v & 3));
 }
 
 /**
Arguably, that should've been something like:
((unsigned long)v & (size-1))
I suppose.
I've been testing this patch and these new WARN_ON splats have revealed some issues.
To start with, I overlooked atomic64_t in my patch. But aligning that isn't sufficient in some cases: we have kmem caches of structs containing atomic64_t members, where kmem_cache_create() is used with a default alignment of 4, for example, the struct inode cache in fs/proc/inode.c. So I changed the above CONFIG_DEBUG_ATOMIC test to (unsigned long)v & (sizeof(long) - 1).
Another issue I encountered was atomic operations used on non-atomic types. The try_cmpxchg() in task_work_add() triggers the CONFIG_DEBUG_ATOMIC misalignment WARN because of the 850 byte offset of task_works into struct task_struct. I got around this by adding __aligned(sizeof(long)) to task_works, but maybe __aligned(sizeof(atomic_t)) would be better (?)
Another example of this problem (i.e. atomic operation used with non-atomic type) appears in wait_task_zombie() in kernel/exit.c, where cmpxchg() is used on exit_state, found at offset 418 in struct task_struct. I prevented this by adding __aligned(sizeof(long)) to exit_state. I'm not sure what the right patch is.
A different problem shows up in spi_setup_transport_attrs() where spi_dv_mutex() is used to coerce starget_data into struct spi_transport_attrs, which happens to contain some atomic storage. starget_data is found at the end of the dynamically allocated struct scsi_target (not an uncommon pattern). So starget_data ends up at offset 290, and struct spi_transport_attrs is also misaligned, as is dv_mutex. To get around this, I changed the definition of struct scsi_target:
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -370,7 +370,7 @@ struct scsi_target {
 	char			scsi_level;
 	enum scsi_target_state	state;
 	void			*hostdata;	/* available to low-level driver */
-	unsigned long		starget_data[];	/* for the transport */
+	atomic64_t		starget_data[];	/* for the transport */
 	/* starget_data must be the last element!!!! */
 } __attribute__((aligned(sizeof(unsigned long))));
This patch works because starget_data is never accessed except where it is being cast as some other type. But there's probably a reason for the 'long' here that I don't know about... maybe I should use atomic_long_t or atomic_t.
I found a few structures in the VFS layer with a misaligned lock, which I patched my way around. There's still a WARN splat from down_write() for semaphores somewhere in the VFS and TTY layers, but I was unable to track down the relevant memory allocations:
[ 59.500000] ------------[ cut here ]------------
[ 59.500000] WARNING: CPU: 0 PID: 329 at ./include/linux/instrumented.h:101 rwsem_down_write_slowpath+0x26a/0x4ca
[ 59.500000] Modules linked in:
[ 59.500000] CPU: 0 UID: 0 PID: 329 Comm: (udev-worker) Not tainted 6.16.0-multi-00009-g2b4218de16c4 #1 NONE
[ 59.500000] Stack from 01e4be64:
[ 59.500000]  01e4be64 005b83e3 005b83e3 005aa650 00000065 00000009 01e4be88 00007398
[ 59.500000]  005b83e3 01e4be98 000353a0 005aa41f 01e3024c 01e4bed0 00002594 005aa650
[ 59.500000]  00000065 00507f2a 00000009 00000000 00000000 00000000 00000002 00000002
[ 59.500000]  00000000 00000000 01e3024c 01e4bf40 00507f2a 005aa650 00000065 00000009
[ 59.500000]  00000000 00000000 00000000 00141ef8 0013e360 fffffffe 0aba9500 01e3024c
[ 59.500000]  01e4bfa0 00f981b0 00000001 01e4bf2a 01e30254 01e4bf2a 00070000 00020000
[ 59.500000] Call Trace: [<00007398>] dump_stack+0x10/0x16
[ 59.500000]  [<000353a0>] __warn+0xd0/0xf8
[ 59.500000]  [<00002594>] warn_slowpath_fmt+0x54/0x62
[ 59.500000]  [<00507f2a>] rwsem_down_write_slowpath+0x26a/0x4ca
[ 59.500000]  [<00507f2a>] rwsem_down_write_slowpath+0x26a/0x4ca
[ 59.500000]  [<00141ef8>] iput+0x0/0x1fe
[ 59.500000]  [<0013e360>] dput+0x0/0x160
[ 59.500000]  [<00070000>] defer_console_output+0x28/0x34
[ 59.500000]  [<00020000>] _FP_CALL_TOP+0x25c2/0xd512
[ 59.500000]  [<000101e4>] ikbd_mouse_y0_bot+0x4/0x1c
[ 59.500000]  [<0000ffff>] atari_keyb_init+0xf7/0x17a
[ 59.500000]  [<005081c2>] down_write+0x38/0xf6
[ 59.500000]  [<0013733a>] do_unlinkat+0xc8/0x264
[ 59.500000]  [<0013755e>] sys_unlink+0x1c/0x20
[ 59.500000]  [<0000876a>] syscall+0x8/0xc
[ 59.500000]  [<0000c048>] die_if_kernel.part.0+0x2c/0x40
[ 59.500000]
[ 59.500000] ---[ end trace 0000000000000000 ]---
All of my testing was done on m68k. It would be interesting to try this on some other 32-bit architecture that tolerates misaligned accesses. More misaligned 64-bit atomics would be unsurprising.
The patches for these experiments may be found at, https://github.com/fthain/linux/tree/atomic_t
Hi Finn & Lance,
On 25.8.2025 5.03, Finn Thain wrote:
Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
[...]
Reported-by: Eero Tamminen oak@helsinkinet.fi Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc... Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") Signed-off-by: Finn Thain fthain@linux-m68k.org
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
Yes, it fixes both of the issues (warnings & broken console): Tested-by: Eero Tamminen oak@helsinkinet.fi
(Emulated Atari Falcon) boot up performance with this is within normal variation.
On 23.8.2025 10.49, Lance Yang wrote:
Anyway, I've prepared two patches for discussion, either of which should fix the alignment issue :)
Patch A[1] adjusts the runtime checks to handle unaligned pointers. Patch B[2] enforces 4-byte alignment on the core lock structures.
Both tested on x86-64.
[1]
https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev
[2] https://lore.kernel.org/lkml/20250823074048.92498-1-lance.yang@linux.dev
Same goes for both of these, except that removing warnings makes minimal kernel boot 1-2% faster than 4-aligning the whole struct.
- Eero
Hi Eero,
On 2025/8/26 23:22, Eero Tamminen wrote:
Hi Finn & Lance,
On 25.8.2025 5.03, Finn Thain wrote:
Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
[...]
Reported-by: Eero Tamminen oak@helsinkinet.fi
Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/
Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
Signed-off-by: Finn Thain fthain@linux-m68k.org
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
Yes, it fixes both of the issues (warnings & broken console): Tested-by: Eero Tamminen oak@helsinkinet.fi
(Emulated Atari Falcon) boot up performance with this is within normal variation.
On 23.8.2025 10.49, Lance Yang wrote:
Anyway, I've prepared two patches for discussion, either of which should fix the alignment issue :)
Patch A[1] adjusts the runtime checks to handle unaligned pointers. Patch B[2] enforces 4-byte alignment on the core lock structures.
Both tested on x86-64.
[1] https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev
[2] https://lore.kernel.org/lkml/20250823074048.92498-1-lance.yang@linux.dev
Same goes for both of these, except that removing warnings makes minimal kernel boot 1-2% faster than 4-aligning the whole struct.
Thanks a lot for testing!
Cheers, Lance
Hi Eero,
On Tue, 26 Aug 2025 at 17:22, Eero Tamminen oak@helsinkinet.fi wrote:
On 25.8.2025 5.03, Finn Thain wrote:
Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
[...]
Reported-by: Eero Tamminen oak@helsinkinet.fi Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc... Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker") Signed-off-by: Finn Thain fthain@linux-m68k.org
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
Yes, it fixes both of the issues (warnings & broken console): Tested-by: Eero Tamminen oak@helsinkinet.fi
(Emulated Atari Falcon) boot up performance with this is within normal variation.
On 23.8.2025 10.49, Lance Yang wrote:
Anyway, I've prepared two patches for discussion, either of which should fix the alignment issue :)
Patch A[1] adjusts the runtime checks to handle unaligned pointers. Patch B[2] enforces 4-byte alignment on the core lock structures.
Both tested on x86-64.
[1]
https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev
[2] https://lore.kernel.org/lkml/20250823074048.92498-1-lance.yang@linux.dev
Same goes for both of these, except that removing warnings makes minimal kernel boot 1-2% faster than 4-aligning the whole struct.
That is an interesting outcome! So the gain of naturally-aligning the lock is more than offset by the increased cache pressure due to wasting (a bit?) more memory.
Do you know what was the impact on total kernel size? Thanks!
Gr{oetje,eeting}s,
Geert
Hi Geert,
On 1.9.2025 11.51, Geert Uytterhoeven wrote:
On 23.8.2025 10.49, Lance Yang wrote:
Anyway, I've prepared two patches for discussion, either of which should fix the alignment issue :)
Patch A[1] adjusts the runtime checks to handle unaligned pointers. Patch B[2] enforces 4-byte alignment on the core lock structures.
Both tested on x86-64.
[1]
https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev
[2] https://lore.kernel.org/lkml/20250823074048.92498-1-lance.yang@linux.dev
Same goes for both of these, except that removing warnings makes minimal kernel boot 1-2% faster than 4-aligning the whole struct.
Note that above result was from (emulated) 68030 Falcon, i.e. something that has really small caches (256-byte i-/d-cache), *and* a kernel config using CONFIG_CC_OPTIMIZE_FOR_SIZE=y (with GCC 12.2).
That is an interesting outcome! So the gain of naturally-aligning the lock is more than offset by the increased cache pressure due to wasting (a bit?) more memory.
Another reason could be those extra inlined warning checks in:
-----------------------------------------------------
$ git grep -e hung_task_set_blocker -e hung_task_clear_blocker kernel/
kernel/locking/mutex.c:	hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX);
kernel/locking/mutex.c:	hung_task_clear_blocker();
kernel/locking/rwsem.c:	hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER);
kernel/locking/rwsem.c:	hung_task_clear_blocker();
kernel/locking/rwsem.c:	hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER);
kernel/locking/rwsem.c:	hung_task_clear_blocker();
kernel/locking/semaphore.c:	hung_task_set_blocker(sem, BLOCKER_TYPE_SEM);
kernel/locking/semaphore.c:	hung_task_clear_blocker();
-----------------------------------------------------
Do you know what was the impact on total kernel size?
As expected, kernel code size is smaller with the static inlined warn checks removed:
-----------------------------------------------------
$ size vmlinux-m68k-6.16-fix1 vmlinux-m68k-6.16-fix2
   text	   data	    bss	    dec	    hex	filename
3088520	 953532	  84224	4126276	 3ef644	vmlinux-m68k-6.16-fix1 [1]
3088730	 953564	  84192	4126486	 3ef716	vmlinux-m68k-6.16-fix2 [2]
-----------------------------------------------------
But could the struct alignment have caused 32 bytes to move from the BSS to the DATA section?
- Eero
PS. I profiled these 3 kernels on emulated Falcon. According to the (Hatari) profiler, the main difference is that the kernel with the warnings removed does less than half as many calls to NCR5380_read() / atari_scsi_reg_read(), compared to the other 2 versions.
These additional calls in the other two versions seem to mostly come through a chain originating from process_scheduled_works(), the NCR5380_poll_politely*() functions, and bus probing.

After a quick look at the WARN_ON_ONCE()s and the SCSI code, I have no idea how having those checks inlined into the locking functions, or not, would cause a difference like that. I've tried patching & building the kernels again and repeating the profiling, but the result is the same.
While Hatari call (graph) tracking might have some issue (due to kernel stack return address manipulation), I don't see how there could be a problem with the profiler instruction counts. Kernel code at a given address does not change during boot in a monolithic kernel, the (emulator) profiler tracks _every_ executed instruction/address, and it's clearly the correct function:
------------------------------------
# disassembly with profile data:
# <instructions percentage>% (<sum of instructions>, <sum of cycles>, <sum of i-cache misses>, <sum of d-cache hits>)
...
atari_scsi_falcon_reg_read:
$001dd826 link.w  a6,#$0         0.43% (414942, 1578432, 44701, 0)
$001dd82a move.w  sr,d1          0.43% (414942, 224, 8, 0)
$001dd82c ori.w   #$700,sr       0.43% (414942, 414368, 44705, 0)
$001dd830 move.l  $8(a6),d0      0.43% (414942, 357922, 44705, 414911)
$001dd834 addi.l  #$88,d0        0.43% (414942, 1014804, 133917, 0)
$001dd83a move.w  d0,$8606.w     0.43% (414942, 3618352, 89169, 0)
$001dd83e move.w  $8604.w,d0     0.43% (414942, 3620646, 89162, 0)
$001dd842 move.w  d1,sr          0.43% (414942, 2148, 142, 0)
$001dd844 unlk    a6             0.43% (414942, 436, 0, 414893)
$001dd846 rts                    0.43% (414942, 1073934, 134123, 414942)
atari_scsi_falcon_reg_write:
$001dd848 link.w  a6,#$0         0.00% (81, 484, 29, 0)
$001dd84c move.l  $c(a6),d0      0.00% (81, 326, 29, 73)
...
------------------------------------
Maybe those WARN_ON_ONCE() checks just happen to slow down something marginally so that things get interrupted & re-started more for the SCSI code?
PPS. The emulated machine has no SCSI drives, only one IDE drive (with a 4MB Busybox partition):
----------------------------------------------------
scsi host0: Atari native SCSI, irq 15, io_port 0x0, base 0x0, can_queue 1, cmd_per_lun 2, sg_tablesize 1, this_id 7, flags { }
atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller
scsi host1: pata_falcon
ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no IRQ, using PIO polling
...
ata1: found unknown device (class 0)
ata1.00: ATA-7: Hatari IDE disk 4M, 1.0, max UDMA/100
ata1.00: 8192 sectors, multi 16: LBA48
ata1.00: configured for PIO
...
scsi 1:0:0:0: Direct-Access     ATA      Hatari IDE disk  1.0  PQ: 0 ANSI: 5
sd 1:0:0:0: [sda] 8192 512-byte logical blocks: (4.19 MB/4.00 MiB)
sd 1:0:0:0: [sda] Write Protect is off
sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sda] Preferred minimum I/O size 512 bytes
sd 1:0:0:0: [sda] Attached SCSI disk
VFS: Mounted root (ext2 filesystem) readonly on device 8:0.
----------------------------------------------------
On Mon, 1 Sep 2025 18:12:53 +0300 Eero Tamminen oak@helsinkinet.fi wrote:
Hi Geert,
On 1.9.2025 11.51, Geert Uytterhoeven wrote:
On 23.8.2025 10.49, Lance Yang wrote:
Anyway, I've prepared two patches for discussion, either of which should fix the alignment issue :)
Patch A[1] adjusts the runtime checks to handle unaligned pointers. Patch B[2] enforces 4-byte alignment on the core lock structures.
Both tested on x86-64.
[1] https://lore.kernel.org/lkml/20250823050036.7748-1-lance.yang@linux.dev

[2] https://lore.kernel.org/lkml/20250823074048.92498-1-lance.yang@linux.dev
Same goes for both of these, except that removing the warnings makes a minimal kernel boot 1-2% faster than 4-aligning the whole struct.
Note that above result was from (emulated) 68030 Falcon, i.e. something that has really small caches (256-byte i-/d-cache), *and* a kernel config using CONFIG_CC_OPTIMIZE_FOR_SIZE=y (with GCC 12.2).
If you are emulating it on x86, the misaligned memory accesses are likely to be zero cost. On a real '030 I'd expect them to be implemented as two memory accesses. I also doubt (but that's a guess) that the emulator even attempts to emulate the '030 caches. If they are like the '020 ones, the i-cache really only helps short loops.
It is more likely that the cost of WARN_ON_ONCE() is far more than you might expect. Especially since it will affect register allocation in the function(s).
David
That is an interesting outcome! So the gain of naturally-aligning the lock is more than offset by the increased cache pressure due to wasting (a bit?) more memory.
Another reason could be those extra inlined warning checks in:
$ git grep -e hung_task_set_blocker -e hung_task_clear_blocker kernel/
kernel/locking/mutex.c:        hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX);
kernel/locking/mutex.c:        hung_task_clear_blocker();
kernel/locking/rwsem.c:        hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_READER);
kernel/locking/rwsem.c:        hung_task_clear_blocker();
kernel/locking/rwsem.c:        hung_task_set_blocker(sem, BLOCKER_TYPE_RWSEM_WRITER);
kernel/locking/rwsem.c:        hung_task_clear_blocker();
kernel/locking/semaphore.c:    hung_task_set_blocker(sem, BLOCKER_TYPE_SEM);
kernel/locking/semaphore.c:    hung_task_clear_blocker();
Do you know what was the impact on total kernel size?
As expected, kernel code size is smaller with the static inlined warn checks removed:
$ size vmlinux-m68k-6.16-fix1 vmlinux-m68k-6.16-fix2
   text    data     bss     dec     hex filename
3088520  953532   84224 4126276  3ef644 vmlinux-m68k-6.16-fix1 [1]
3088730  953564   84192 4126486  3ef716 vmlinux-m68k-6.16-fix2 [2]
But could the aligning of structs have caused 32 bytes to move from the BSS to the DATA section?
- Eero
PS. I profiled these 3 kernels on an emulated Falcon. According to the (Hatari) profiler, the main difference in the kernel with the warnings removed is that it makes less than half as many calls to NCR5380_read() / atari_scsi_reg_read(), compared to the other 2 versions.

These additional 2x calls in the other two versions seem to mostly come through chains originating from process_scheduled_works(), the NCR5380_poll_politely*() functions, and bus probing.

After a quick look at the WARN_ON_ONCE()s and the SCSI code, I have no idea how having those checks inlined into the locking functions, or not, would cause a difference like that. I've tried patching & building the kernels again and repeating the profiling, but the result is the same.

While Hatari call (graph) tracking might have some issues (due to kernel stack return address manipulation), I don't see how there could be a problem with the profiler instruction counts. Kernel code at a given address does not change during boot in a monolithic kernel, the (emulator) profiler tracks _every_ executed instruction/address, and it's clearly the correct function:
# disassembly with profile data:
#   <instructions percentage>% (<sum of instructions>, <sum of cycles>,
#    <sum of i-cache misses>, <sum of d-cache hits>)
...
atari_scsi_falcon_reg_read:
$001dd826  link.w  a6,#$0       0.43% (414942, 1578432, 44701, 0)
$001dd82a  move.w  sr,d1        0.43% (414942, 224, 8, 0)
$001dd82c  ori.w   #$700,sr     0.43% (414942, 414368, 44705, 0)
$001dd830  move.l  $8(a6),d0    0.43% (414942, 357922, 44705, 414911)
$001dd834  addi.l  #$88,d0      0.43% (414942, 1014804, 133917, 0)
$001dd83a  move.w  d0,$8606.w   0.43% (414942, 3618352, 89169, 0)
$001dd83e  move.w  $8604.w,d0   0.43% (414942, 3620646, 89162, 0)
$001dd842  move.w  d1,sr        0.43% (414942, 2148, 142, 0)
$001dd844  unlk    a6           0.43% (414942, 436, 0, 414893)
$001dd846  rts                  0.43% (414942, 1073934, 134123, 414942)
atari_scsi_falcon_reg_write:
$001dd848  link.w  a6,#$0       0.00% (81, 484, 29, 0)
$001dd84c  move.l  $c(a6),d0    0.00% (81, 326, 29, 73)
...
Maybe those WARN_ON_ONCE() checks just happen to slow something down marginally, so that things get interrupted & restarted more in the SCSI code?
PPS. emulated machine has no SCSI drives, only one IDE drive (with 4MB Busybox partition):
scsi host0: Atari native SCSI, irq 15, io_port 0x0, base 0x0, can_queue 1, cmd_per_lun 2, sg_tablesize 1, this_id 7, flags { }
atari-falcon-ide atari-falcon-ide: Atari Falcon and Q40/Q60 PATA controller
scsi host1: pata_falcon
ata1: PATA max PIO4 cmd fff00000 ctl fff00038 data fff00000 no IRQ, using PIO polling
...
ata1: found unknown device (class 0)
ata1.00: ATA-7: Hatari IDE disk 4M, 1.0, max UDMA/100
ata1.00: 8192 sectors, multi 16: LBA48
ata1.00: configured for PIO
...
scsi 1:0:0:0: Direct-Access     ATA      Hatari IDE disk  1.0  PQ: 0 ANSI: 5
sd 1:0:0:0: [sda] 8192 512-byte logical blocks: (4.19 MB/4.00 MiB)
sd 1:0:0:0: [sda] Write Protect is off
sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sda] Preferred minimum I/O size 512 bytes
sd 1:0:0:0: [sda] Attached SCSI disk
VFS: Mounted root (ext2 filesystem) readonly on device 8:0.
On Mon, 25 Aug 2025 12:03:05 +1000 Finn Thain fthain@linux-m68k.org wrote:
Some recent commits incorrectly assumed the natural alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). This leads to spurious warnings from the hang check code. Fix this bug by adding the necessary 'aligned' attribute.
Cc: Andrew Morton akpm@linux-foundation.org
Cc: Geert Uytterhoeven geert@linux-m68k.org
Cc: Lance Yang lance.yang@linux.dev
Cc: Masami Hiramatsu mhiramat@kernel.org
Cc: Eero Tamminen oak@helsinkinet.fi
Cc: Peter Zijlstra peterz@infradead.org
Cc: Will Deacon will@kernel.org
Cc: stable@vger.kernel.org
Reported-by: Eero Tamminen oak@helsinkinet.fi
Closes: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc...
Fixes: e711faaafbe5 ("hung_task: replace blocker_mutex with encoded blocker")
Signed-off-by: Finn Thain fthain@linux-m68k.org
This seems good anyway because unaligned atomic memory access sounds insane. So looks good to me.
Reviewed-by: Masami Hiramatsu (Google) mhiramat@kernel.org
Anyway, Lance's patch[1] is still needed. We'd better gracefully ignore the case where the blocker is not aligned, because hung_task blocker detection is optional, for debugging, and not necessary for kernel operation.
[1] https://lore.kernel.org/all/20250823050036.7748-1-lance.yang@linux.dev/
Thank you,
I tested this on m68k using GCC and it fixed the problem for me. AFAIK, the other architectures naturally align ints already so I'm expecting to see no effect there.
 include/linux/types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/types.h b/include/linux/types.h
index 6dfdb8e8e4c3..cd5b2b0f4b02 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -179,7 +179,7 @@ typedef phys_addr_t resource_size_t;
 typedef unsigned long irq_hw_number_t;
 
 typedef struct {
-	int counter;
+	int counter __aligned(sizeof(int));
 } atomic_t;
 
 #define ATOMIC_INIT(i) { (i) }
2.49.1