On Mon, May 27, 2024 at 2:21 AM Sasha Levin wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> nilfs2: make superblock data array index computation sparse friendly
>
> to the 6.9-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> nilfs2-make-superblock-data-array-index-computation-.patch
> and it can be found in the queue-6.9 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 5017482ff3b29550015cce7f81279dc69aefd6fe
> Author: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
> Date: Tue Apr 30 17:00:19 2024 +0900
>
> nilfs2: make superblock data array index computation sparse friendly
>
> [ Upstream commit 91d743a9c8299de1fc1b47428d8bb4c85face00f ]
>
> Upon running sparse, "warning: dubious: x & !y" is output at an array
> index calculation within nilfs_load_super_block().
>
> The calculation is not wrong, but to eliminate the sparse warning, replace
> it with an equivalent calculation.
>
> Also, add a comment to make it easier to understand what the unintuitive
> array index calculation is doing and whether it's correct.
>
> Link: https://lkml.kernel.org/r/20240430080019.4242-3-konishi.ryusuke@gmail.com
> Fixes: e339ad31f599 ("nilfs2: introduce secondary super block")
> Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
> Cc: Bart Van Assche <bvanassche(a)acm.org>
> Cc: Jens Axboe <axboe(a)kernel.dk>
> Cc: kernel test robot <lkp(a)intel.com>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
> index 2ae2c1bbf6d17..adbc6e87471ab 100644
> --- a/fs/nilfs2/the_nilfs.c
> +++ b/fs/nilfs2/the_nilfs.c
> @@ -592,7 +592,7 @@ static int nilfs_load_super_block(struct the_nilfs *nilfs,
> struct nilfs_super_block **sbp = nilfs->ns_sbp;
> struct buffer_head **sbh = nilfs->ns_sbh;
> u64 sb2off, devsize = bdev_nr_bytes(nilfs->ns_bdev);
> - int valid[2], swp = 0;
> + int valid[2], swp = 0, older;
>
> if (devsize < NILFS_SEG_MIN_BLOCKS * NILFS_MIN_BLOCK_SIZE + 4096) {
> nilfs_err(sb, "device size too small");
> @@ -648,9 +648,25 @@ static int nilfs_load_super_block(struct the_nilfs *nilfs,
> if (swp)
> nilfs_swap_super_block(nilfs);
>
> + /*
> + * Calculate the array index of the older superblock data.
> + * If one has been dropped, set index 0 pointing to the remaining one,
> + * otherwise set index 1 pointing to the old one (including if both
> + * are the same).
> + *
> + * Divided case              valid[0]  valid[1]  swp  ->  older
> + * --------------------------------------------------------------
> + * Both SBs are invalid         0         0      N/A     (Error)
> + * SB1 is invalid               0         1       1         0
> + * SB2 is invalid               1         0       0         0
> + * SB2 is newer                 1         1       1         0
> + * SB2 is older or the same     1         1       0         1
> + */
> + older = valid[1] ^ swp;
> +
> nilfs->ns_sbwcount = 0;
> nilfs->ns_sbwtime = le64_to_cpu(sbp[0]->s_wtime);
> - nilfs->ns_prot_seq = le64_to_cpu(sbp[valid[1] & !swp]->s_last_seq);
> + nilfs->ns_prot_seq = le64_to_cpu(sbp[older]->s_last_seq);
> *sbpp = sbp[0];
> return 0;
> }
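As a quick sanity check of the claim that the replacement is equivalent, the two index expressions can be compared over the four reachable rows of the table in the patch comment. This is only a user-space sketch: the function names are made up for illustration and are not part of nilfs2.

```c
#include <assert.h>

/* Index expression before the patch: valid[1] & !swp. */
static int older_index_old(int valid1, int swp)
{
	return valid1 & !swp;
}

/* Index expression after the patch: valid[1] ^ swp. */
static int older_index_new(int valid1, int swp)
{
	return valid1 ^ swp;
}

/*
 * Compare both expressions on the four non-error rows of the table in
 * the patch comment. Each row is {valid[1], swp, expected older index}.
 */
static void check_older_index_forms(void)
{
	static const int rows[4][3] = {
		{ 1, 1, 0 },	/* SB1 is invalid */
		{ 0, 0, 0 },	/* SB2 is invalid */
		{ 1, 1, 0 },	/* SB2 is newer */
		{ 1, 0, 1 },	/* SB2 is older or the same */
	};
	int i;

	for (i = 0; i < 4; i++) {
		assert(older_index_old(rows[i][0], rows[i][1]) == rows[i][2]);
		assert(older_index_new(rows[i][0], rows[i][1]) == rows[i][2]);
	}
}
```

Note that the two expressions differ for valid[1] = 0 with swp = 1, so the equivalence relies on that combination not occurring; the table in the patch comment lists no such row.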
This commit fixes a sparse warning emitted when building with
"make C=1" (the sparse check), but it does not fix any operational bug.
Therefore, if fixing a harmless sparse warning does not meet the
requirements for backporting to stable trees (I assume it does),
please drop it as a false-positive pickup. Sorry if the
"Fixes:" tag is confusing.
The same applies to the same patch queued for the other stable trees.
Thanks,
Ryusuke Konishi
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x e60b613df8b6253def41215402f72986fee3fc8d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052759-earmark-vagrantly-05cf@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e60b613df8b6253def41215402f72986fee3fc8d Mon Sep 17 00:00:00 2001
From: Zheng Yejian <zhengyejian1(a)huawei.com>
Date: Fri, 10 May 2024 03:28:59 +0800
Subject: [PATCH] ftrace: Fix possible use-after-free issue in
ftrace_location()
KASAN reports a bug:
BUG: KASAN: use-after-free in ftrace_location+0x90/0x120
Read of size 8 at addr ffff888141d40010 by task insmod/424
CPU: 8 PID: 424 Comm: insmod Tainted: G W 6.9.0-rc2+
[...]
Call Trace:
<TASK>
dump_stack_lvl+0x68/0xa0
print_report+0xcf/0x610
kasan_report+0xb5/0xe0
ftrace_location+0x90/0x120
register_kprobe+0x14b/0xa40
kprobe_init+0x2d/0xff0 [kprobe_example]
do_one_initcall+0x8f/0x2d0
do_init_module+0x13a/0x3c0
load_module+0x3082/0x33d0
init_module_from_file+0xd2/0x130
__x64_sys_finit_module+0x306/0x440
do_syscall_64+0x68/0x140
entry_SYSCALL_64_after_hwframe+0x71/0x79
The root cause is that lookup_rec() searches for the ftrace record of an
address in the ftrace pages of a module while those pages are being
freed by ftrace_release_mod(), because the corresponding module is
being deleted:
CPU1 | CPU2
register_kprobes() { | delete_module() {
check_kprobe_address_safe() { |
arch_check_ftrace_location() { |
ftrace_location() { |
lookup_rec() // USE! | ftrace_release_mod() // Free!
To fix this issue:
1. Hold the RCU read lock while accessing ftrace pages in
ftrace_location_range();
2. Use ftrace_location_range() instead of lookup_rec() in
ftrace_location();
3. Call synchronize_rcu() before freeing any ftrace pages in
ftrace_process_locs(), ftrace_release_mod() and ftrace_free_mem().
Link: https://lore.kernel.org/linux-trace-kernel/20240509192859.1273558-1-zhengye…
Cc: stable(a)vger.kernel.org
Cc: <mhiramat(a)kernel.org>
Cc: <mark.rutland(a)arm.com>
Cc: <mathieu.desnoyers(a)efficios.com>
Fixes: ae6aa16fdc16 ("kprobes: introduce ftrace based optimization")
Suggested-by: Steven Rostedt <rostedt(a)goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1(a)huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 5a01d72f66db..2308c0a2fd29 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1595,12 +1595,15 @@ static struct dyn_ftrace *lookup_rec(unsigned long start, unsigned long end)
unsigned long ftrace_location_range(unsigned long start, unsigned long end)
{
struct dyn_ftrace *rec;
+ unsigned long ip = 0;
+ rcu_read_lock();
rec = lookup_rec(start, end);
if (rec)
- return rec->ip;
+ ip = rec->ip;
+ rcu_read_unlock();
- return 0;
+ return ip;
}
/**
@@ -1614,25 +1617,22 @@ unsigned long ftrace_location_range(unsigned long start, unsigned long end)
*/
unsigned long ftrace_location(unsigned long ip)
{
- struct dyn_ftrace *rec;
+ unsigned long loc;
unsigned long offset;
unsigned long size;
- rec = lookup_rec(ip, ip);
- if (!rec) {
+ loc = ftrace_location_range(ip, ip);
+ if (!loc) {
if (!kallsyms_lookup_size_offset(ip, &size, &offset))
goto out;
/* map sym+0 to __fentry__ */
if (!offset)
- rec = lookup_rec(ip, ip + size - 1);
+ loc = ftrace_location_range(ip, ip + size - 1);
}
- if (rec)
- return rec->ip;
-
out:
- return 0;
+ return loc;
}
/**
@@ -6591,6 +6591,8 @@ static int ftrace_process_locs(struct module *mod,
/* We should have used all pages unless we skipped some */
if (pg_unuse) {
WARN_ON(!skipped);
+ /* Need to synchronize with ftrace_location_range() */
+ synchronize_rcu();
ftrace_free_pages(pg_unuse);
}
return ret;
@@ -6804,6 +6806,9 @@ void ftrace_release_mod(struct module *mod)
out_unlock:
mutex_unlock(&ftrace_lock);
+ /* Need to synchronize with ftrace_location_range() */
+ if (tmp_page)
+ synchronize_rcu();
for (pg = tmp_page; pg; pg = tmp_page) {
/* Needs to be called outside of ftrace_lock */
@@ -7137,6 +7142,7 @@ void ftrace_free_mem(struct module *mod, void *start_ptr, void *end_ptr)
unsigned long start = (unsigned long)(start_ptr);
unsigned long end = (unsigned long)(end_ptr);
struct ftrace_page **last_pg = &ftrace_pages_start;
+ struct ftrace_page *tmp_page = NULL;
struct ftrace_page *pg;
struct dyn_ftrace *rec;
struct dyn_ftrace key;
@@ -7178,12 +7184,8 @@ void ftrace_free_mem(struct module *mod, void *start_ptr, void *end_ptr)
ftrace_update_tot_cnt--;
if (!pg->index) {
*last_pg = pg->next;
- if (pg->records) {
- free_pages((unsigned long)pg->records, pg->order);
- ftrace_number_of_pages -= 1 << pg->order;
- }
- ftrace_number_of_groups--;
- kfree(pg);
+ pg->next = tmp_page;
+ tmp_page = pg;
pg = container_of(last_pg, struct ftrace_page, next);
if (!(*last_pg))
ftrace_pages = pg;
@@ -7200,6 +7202,11 @@ void ftrace_free_mem(struct module *mod, void *start_ptr, void *end_ptr)
clear_func_from_hashes(func);
kfree(func);
}
+ /* Need to synchronize with ftrace_location_range() */
+ if (tmp_page) {
+ synchronize_rcu();
+ ftrace_free_pages(tmp_page);
+ }
}
void __init ftrace_free_init_mem(void)
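The ftrace_free_mem() hunk in the patch above stops freeing each page inline: pages are unlinked onto a temporary list and freed only after a single synchronize_rcu(). A minimal user-space sketch of that shape follows. Everything in it is hypothetical: struct page_node stands in for struct ftrace_page, and wait_for_readers_stub() is a counter-only placeholder for synchronize_rcu(), so it models the control flow and provides no real reader protection.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ftrace_page; not the kernel type. */
struct page_node {
	struct page_node *next;
	int index;	/* records in use; 0 means the page is empty */
};

static int grace_periods;	/* how many times the stub below has run */

/*
 * Stub standing in for synchronize_rcu(). The real call blocks until all
 * pre-existing RCU read-side critical sections have finished.
 */
static void wait_for_readers_stub(void)
{
	grace_periods++;
}

/* Prepend a page to the list; returns 0 on success, -1 on allocation failure. */
static int push_page(struct page_node **head, int index)
{
	struct page_node *pg = malloc(sizeof(*pg));

	if (!pg)
		return -1;
	pg->index = index;
	pg->next = *head;
	*head = pg;
	return 0;
}

/*
 * Unlink every empty page onto a private list, wait once, then free the
 * private list. Returns the number of pages freed.
 */
static int release_empty_pages(struct page_node **head)
{
	struct page_node **link = head;
	struct page_node *tmp = NULL;
	struct page_node *pg;
	int freed = 0;

	while ((pg = *link) != NULL) {
		if (pg->index == 0) {
			*link = pg->next;	/* unlink from the shared list */
			pg->next = tmp;		/* chain onto the private list */
			tmp = pg;
		} else {
			link = &pg->next;
		}
	}

	if (tmp)
		wait_for_readers_stub();	/* one wait covers every unlinked page */

	while ((pg = tmp) != NULL) {
		tmp = pg->next;
		free(pg);
		freed++;
	}
	return freed;
}

/* Usage: two empty pages and one used page; only the empty ones are freed. */
static void demo_release(void)
{
	struct page_node *head = NULL;
	int before = grace_periods;
	int ok, freed;

	ok = push_page(&head, 0) == 0 && push_page(&head, 3) == 0 &&
	     push_page(&head, 0) == 0;
	assert(ok);

	freed = release_empty_pages(&head);
	assert(freed == 2);
	assert(grace_periods == before + 1);
	assert(head && head->index == 3 && !head->next);

	free(head);
}
```

Collecting the pages first means the wait is paid once per call rather than once per page, and it is skipped entirely when nothing was unlinked, which matches the `if (tmp_page)` guard in the patch.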
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x e60b613df8b6253def41215402f72986fee3fc8d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052700-ferry-breeder-caa8@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x e60b613df8b6253def41215402f72986fee3fc8d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052758-tweezers-sassy-6775@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e60b613df8b6253def41215402f72986fee3fc8d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052757-fantasy-resent-77c6@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
From: Josef Bacik <josef(a)toxicpanda.com>
[ Upstream commit 418b9687dece5bd763c09b5c27a801a7e3387be9 ]
nfsd is the only user of this helper, and it does not currently use the
proc private data. When we switch to per-network namespace stats we will
need the struct net * in order to get to the nfsd_net. Use the net as the
proc private so we can utilize this when we make the switch over.
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
net/sunrpc/stats.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
This pre-requisite is needed for upstream commit 4b14885411f7 ("nfsd:
make all of the nfsd stats per-network namespace").
diff --git a/net/sunrpc/stats.c b/net/sunrpc/stats.c
index 65fc1297c6df..383860cb1d5b 100644
--- a/net/sunrpc/stats.c
+++ b/net/sunrpc/stats.c
@@ -314,7 +314,7 @@ EXPORT_SYMBOL_GPL(rpc_proc_unregister);
struct proc_dir_entry *
svc_proc_register(struct net *net, struct svc_stat *statp, const struct proc_ops *proc_ops)
{
- return do_register(net, statp->program->pg_name, statp, proc_ops);
+ return do_register(net, statp->program->pg_name, net, proc_ops);
}
EXPORT_SYMBOL_GPL(svc_proc_register);
--
2.45.1
On present kernels, HPET fallback occurs on some 8-16 socket x86
systems due to the TSC adjust architectural MSR not being respected.
This was fixed upstream in commit 455f9075f14484f358b3c1d6845b4a438de198a7.
Please backport this fix to -stable for 6.9, 6.8, 6.6, 6.1, 5.15,
5.10, 5.4 and 4.19 branches to allow correct TSC operation on existing
distros using these kernels. The patch cleanly applies to all of these
latest -stable branches.
Many thanks,
Daniel
--
Daniel J Blueman
From: Andrey Konovalov <andreyknvl(a)gmail.com>
After commit 8fea0c8fda30 ("usb: core: hcd: Convert from tasklet to BH
workqueue"), usb_giveback_urb_bh() runs in the BH workqueue with
interrupts enabled.
Thus, the remote coverage collection section in usb_giveback_urb_bh()->
__usb_hcd_giveback_urb() might be interrupted, and the interrupt handler
might invoke __usb_hcd_giveback_urb() again.
This breaks KCOV, as it does not support nested remote coverage collection
sections within the same context (neither in task nor in softirq).
Update kcov_remote_start/stop_usb_softirq() to disable interrupts for the
duration of the coverage collection section to avoid nested sections in
the softirq context (in addition to such in the task context, which are
already handled).
Reported-by: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Closes: https://lore.kernel.org/linux-usb/0f4d1964-7397-485b-bc48-11c01e2fcbca@I-lo…
Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2
Suggested-by: Alan Stern <stern(a)rowland.harvard.edu>
Fixes: 8fea0c8fda30 ("usb: core: hcd: Convert from tasklet to BH workqueue")
Cc: stable(a)vger.kernel.org
Acked-by: Dmitry Vyukov <dvyukov(a)google.com>
Signed-off-by: Andrey Konovalov <andreyknvl(a)gmail.com>
---
Changes v2->v3:
- Cc: stable(a)vger.kernel.org.
---
drivers/usb/core/hcd.c | 12 ++++++-----
include/linux/kcov.h | 47 ++++++++++++++++++++++++++++++++++--------
2 files changed, 45 insertions(+), 14 deletions(-)
diff --git a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c
index c0e005670d67..fb1aa0d4fc28 100644
--- a/drivers/usb/core/hcd.c
+++ b/drivers/usb/core/hcd.c
@@ -1623,6 +1623,7 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
struct usb_hcd *hcd = bus_to_hcd(urb->dev->bus);
struct usb_anchor *anchor = urb->anchor;
int status = urb->unlinked;
+ unsigned long flags;
urb->hcpriv = NULL;
if (unlikely((urb->transfer_flags & URB_SHORT_NOT_OK) &&
@@ -1640,13 +1641,14 @@ static void __usb_hcd_giveback_urb(struct urb *urb)
/* pass ownership to the completion handler */
urb->status = status;
/*
- * This function can be called in task context inside another remote
- * coverage collection section, but kcov doesn't support that kind of
- * recursion yet. Only collect coverage in softirq context for now.
+ * Only collect coverage in the softirq context and disable interrupts
+ * to avoid scenarios with nested remote coverage collection sections
+ * that KCOV does not support.
+ * See the comment next to kcov_remote_start_usb_softirq() for details.
*/
- kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
+ flags = kcov_remote_start_usb_softirq((u64)urb->dev->bus->busnum);
urb->complete(urb);
- kcov_remote_stop_softirq();
+ kcov_remote_stop_softirq(flags);
usb_anchor_resume_wakeups(anchor);
atomic_dec(&urb->use_count);
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index b851ba415e03..1068a7318d89 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -55,21 +55,47 @@ static inline void kcov_remote_start_usb(u64 id)
/*
* The softirq flavor of kcov_remote_*() functions is introduced as a temporary
- * work around for kcov's lack of nested remote coverage sections support in
- * task context. Adding support for nested sections is tracked in:
- * https://bugzilla.kernel.org/show_bug.cgi?id=210337
+ * workaround for KCOV's lack of nested remote coverage sections support.
+ *
+ * Adding support is tracked in https://bugzilla.kernel.org/show_bug.cgi?id=210337.
+ *
+ * kcov_remote_start_usb_softirq():
+ *
+ * 1. Only collects coverage when called in the softirq context. This allows
+ * avoiding nested remote coverage collection sections in the task context.
+ * For example, USB/IP calls usb_hcd_giveback_urb() in the task context
+ * within an existing remote coverage collection section. Thus, KCOV should
+ * not attempt to start collecting coverage within the coverage collection
+ * section in __usb_hcd_giveback_urb() in this case.
+ *
+ * 2. Disables interrupts for the duration of the coverage collection section.
+ * This allows avoiding nested remote coverage collection sections in the
+ * softirq context (a softirq might occur during the execution of a work in
+ * the BH workqueue, which runs with in_serving_softirq() > 0).
+ * For example, usb_giveback_urb_bh() runs in the BH workqueue with
+ * interrupts enabled, so __usb_hcd_giveback_urb() might be interrupted in
+ * the middle of its remote coverage collection section, and the interrupt
+ * handler might invoke __usb_hcd_giveback_urb() again.
*/
-static inline void kcov_remote_start_usb_softirq(u64 id)
+static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
{
- if (in_serving_softirq())
+ unsigned long flags = 0;
+
+ if (in_serving_softirq()) {
+ local_irq_save(flags);
kcov_remote_start_usb(id);
+ }
+
+ return flags;
}
-static inline void kcov_remote_stop_softirq(void)
+static inline void kcov_remote_stop_softirq(unsigned long flags)
{
- if (in_serving_softirq())
+ if (in_serving_softirq()) {
kcov_remote_stop();
+ local_irq_restore(flags);
+ }
}
#ifdef CONFIG_64BIT
@@ -103,8 +129,11 @@ static inline u64 kcov_common_handle(void)
}
static inline void kcov_remote_start_common(u64 id) {}
static inline void kcov_remote_start_usb(u64 id) {}
-static inline void kcov_remote_start_usb_softirq(u64 id) {}
-static inline void kcov_remote_stop_softirq(void) {}
+static inline unsigned long kcov_remote_start_usb_softirq(u64 id)
+{
+ return 0;
+}
+static inline void kcov_remote_stop_softirq(unsigned long flags) {}
#endif /* CONFIG_KCOV */
#endif /* _LINUX_KCOV_H */
--
2.25.1
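The start/stop pairing in the kcov change above passes the saved interrupt
state from the start helper to the stop helper. Below is a minimal userspace
sketch of that shape only. Everything in it is a hypothetical stand-in
(fake_irqs_enabled, in_softirq, open_sections, fake_irq_save(),
fake_irq_restore(), section_start(), section_stop()); none of it is a
kernel API, and real interrupt masking cannot be modelled with a plain flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated CPU state; hypothetical stand-ins, not kernel state. */
bool fake_irqs_enabled = true;
bool in_softirq;
int open_sections;

/* Remember the current "interrupt" state, then disable. */
static unsigned long fake_irq_save(void)
{
	unsigned long flags = fake_irqs_enabled;

	fake_irqs_enabled = false;
	return flags;
}

/* Put back whatever state fake_irq_save() observed. */
static void fake_irq_restore(unsigned long flags)
{
	fake_irqs_enabled = flags != 0;
}

/* Same shape as the patched start helper: returns the saved flags. */
unsigned long section_start(void)
{
	unsigned long flags = 0;

	if (in_softirq) {
		flags = fake_irq_save();
		open_sections++;
	}
	return flags;
}

/* Same shape as the patched stop helper: consumes the saved flags. */
void section_stop(unsigned long flags)
{
	if (in_softirq) {
		assert(open_sections > 0);
		open_sections--;
		fake_irq_restore(flags);
	}
}
```

Returning the flags from the start helper keeps the save and restore tied
to the same call site, so the caller needs only one local variable, as in
the __usb_hcd_giveback_urb() hunk.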
[CCing Greg and stable and regressions list]
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.
Please correct me if I'm wrong, but since 6.8.10 we afaics have below
regression report about a general protection fault as well as two
reports about NFSd related NULL pointer dereferences[1] that got some
attention[2]. From a quick look I saw no fixes for those queued for the
next 6.8.y release. And the series might be EOL soon.
Hmmm. That does not sound good. Is there some easy way to fix this?
Chuck, you earlier mentioned (see quote below) that Greg pulled in three
changes as dep into 6.8.10 that might be unneeded. Might reverting all
three be the best way forward?
Ciao, Thorsten
[1]
https://lore.kernel.org/all/CAK2bqVJoT3yy2m0OmTnqH9EAKkj6O1iTk42EyyMtvvxKh6…
and https://lore.kernel.org/all/A8DQDS.ZXN0FMYZ3DIM1@gmail.com/
[2] https://x.com/spendergrsec/status/1793489498443252143
On 24.05.24 20:21, Chuck Lever III wrote:
>> On May 24, 2024, at 4:59 AM, Jaroslav Pulchart <jaroslav.pulchart(a)gooddata.com> wrote:
>>
>>>
>>>>
>>>> On Wed, May 22, 2024 at 04:36:57AM -0400, Jaroslav Pulchart wrote:
>>>>> Hello,
>>>>>
>>>>> I would like to report some issue causing a "general protection fault"
>>>>> crash (constantly) after we updated the kernel from 6.8.9 to 6.8.10.
>>>>> This is triggered when monitoring is using nfsstat on a server where
>>>>> nfsd is running.
>>>>>
>>>>> [ 3049.260633] general protection fault, probably for non-canonical
>>>>> address 0x66fb103e19e9cc89: 0000 [#1] PREEMPT SMP NOPTI
>>>>> [ 3049.261628] CPU: 22 PID: 74991 Comm: nfsstat Tainted: G
>>>>> E 6.8.10-1.gdc.el9.x86_64 #1
>>>>> [ 3049.262336] Hardware name: RDO OpenStack Compute/RHEL, BIOS
>>>>> edk2-20240214-2.el9 02/14/2024
>>>>> [ 3049.263003] RIP: 0010:_raw_spin_lock_irqsave+0x19/0x40
>>>>> [ 3049.263487] Code: cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
>>>>> 90 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 a6 92 f5 42 31 c0 ba 01
>>>>> 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 d0
>>>>> 07 00
>>>>> [ 3049.264882] RSP: 0018:ffffb1bca6b9bd00 EFLAGS: 00010046
>>>>> [ 3049.265365] RAX: 0000000000000000 RBX: 66fb103e19e9c989 RCX: 0000000000000001
>>>>> [ 3049.265953] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 66fb103e19e9cc89
>>>>> [ 3049.266542] RBP: ffffffffc15df280 R08: 0000000000000001 R09: ffffa049a1785cb8
>>>>> [ 3049.267112] R10: ffffb1bca6b9bd70 R11: ffffa04964e49000 R12: 0000000000000246
>>>>> [ 3049.267702] R13: 66fb103e19e9cc89 R14: ffffa048445590a0 R15: 0000000000000001
>>>>> [ 3049.268278] FS: 00007fa3ddf03740(0000) GS:ffffa05703d00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [ 3049.268928] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 3049.269443] CR2: 00007fa3dddfca50 CR3: 0000000342d1e004 CR4: 0000000000770ef0
>>>>> [ 3049.270025] PKRU: 55555554
>>>>> [ 3049.270371] Call Trace:
>>>>> [ 3049.270723] <TASK>
>>>>> [ 3049.271035] ? die_addr+0x33/0x90
>>>>> [ 3049.271423] ? exc_general_protection+0x1ea/0x450
>>>>> [ 3049.271879] ? asm_exc_general_protection+0x22/0x30
>>>>> [ 3049.272344] ? _raw_spin_lock_irqsave+0x19/0x40
>>>>> [ 3049.272803] __percpu_counter_sum+0xd/0x70
>>>>> [ 3049.273219] nfsd_show+0x4f/0x1d0 [nfsd]
>>>>> [ 3049.273666] seq_read_iter+0x11d/0x4d0
>>>>> [ 3049.274073] ? avc_has_perm+0x42/0xc0
>>>>> [ 3049.274489] seq_read+0xfe/0x140
>>>>> [ 3049.274866] proc_reg_read+0x56/0xa0
>>>>> [ 3049.275257] vfs_read+0xa7/0x340
>>>>> [ 3049.275647] ? __do_sys_newfstat+0x57/0x60
>>>>> [ 3049.276059] ksys_read+0x5f/0xe0
>>>>> [ 3049.276439] do_syscall_64+0x5e/0x170
>>>>> [ 3049.276836] entry_SYSCALL_64_after_hwframe+0x78/0x80
>>>>> [ 3049.277296] RIP: 0033:0x7fa3ddcfd9b2
>>>>> [ 3049.277719] Code: c0 e9 b2 fe ff ff 50 48 8d 3d ea 1d 0c 00 e8 c5
>>>>> fd 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75
>>>>> 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89
>>>>> 54 24
>>>>> [ 3049.279139] RSP: 002b:00007ffd930672e8 EFLAGS: 00000246 ORIG_RAX:
>>>>> 0000000000000000
>>>>> [ 3049.279788] RAX: ffffffffffffffda RBX: 0000555ded47c2a0 RCX: 00007fa3ddcfd9b2
>>>>> [ 3049.280402] RDX: 0000000000000400 RSI: 0000555ded47c480 RDI: 0000000000000003
>>>>> [ 3049.281046] RBP: 00007fa3dddf75e0 R08: 0000000000000003 R09: 0000000000000077
>>>>> [ 3049.281673] R10: 000000000000005d R11: 0000000000000246 R12: 0000555ded47c2a0
>>>>> [ 3049.282307] R13: 0000000000000d68 R14: 00007fa3dddf69e0 R15: 0000000000000d68
>>>>> [ 3049.282928] </TASK>
>>>>> [ 3049.283310] Modules linked in: mptcp_diag(E) xsk_diag(E)
>>>>> raw_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) udp_diag(E)
>>>>> tcp_diag(E) inet_diag(E) tun(E) br_netfilter(E) bridge(E) stp(E)
>>>>> llc(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E)
>>>>> nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) binfmt_misc(E)
>>>>> zram(E) tls(E) isofs(E) vfat(E) fat(E) intel_rapl_msr(E)
>>>>> intel_rapl_common(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E)
>>>>> virtio_net(E) i2c_i801(E) virtio_gpu(E) i2c_smbus(E) net_failover(E)
>>>>> virtio_balloon(E) failover(E) virtio_dma_buf(E) fuse(E) ext4(E)
>>>>> mbcache(E) jbd2(E) sr_mod(E) cdrom(E) sg(E) ahci(E) libahci(E)
>>>>> crct10dif_pclmul(E) crc32_pclmul(Ea) polyval_clmulni(E)
>>>>> polyval_generic(E) libata(E) ghash_clmulni_intel(E) sha512_ssse3(E)
>>>>> virtio_blk(E) serio_raw(E) btrfs(E) xor(E) zstd_compress(E)
>>>>> raid6_pq(E) libcrc32c(E) crc32c_intel(E) dm_mirror(E)
>>>>> dm_region_hash(E) dm_log(E) dm_mod(E)
>>>>> [ 3049.283345] Unloaded tainted modules: edac_mce_amd(E):1 padlock_aes(E)
>>>>>
>>>>> Any suggestion on how to fix it is appreciated.
>>>>
>>>> Bisect between v6.8.9 and v6.8.10 would give us the exact point
>>>> where the failures were introduced.
>>>>
>>>> I see that GregKH pulled in:
>>>>
>>>> 26a0ddb04230 ("nfsd: rename NFSD_NET_* to NFSD_STATS_*")
>>>> b7b05f98f3f0 ("nfsd: expose /proc/net/sunrpc/nfsd in net namespaces")
>>>> abf5fb593c90 ("nfsd: make all of the nfsd stats per-network namespace")
>>>>
>>>> for v6.8.10 as a Stable-Dep-of: 18180a4550d0 ("NFSD: Fix nfsd4_encode_fattr4() crasher")
>>>>
>>>> Which is a little baffling, I don't see how those two change sets
>>>> are mechanically related to each other. But I suspect the culprit is
>>>> one of those three stat-related patches.
>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>
>>>
>>> Hello,
>>>
>>> I ran a bisect. It was easy to reproduce: simply executing
>>> "nfsstat" from a terminal hangs the server:
>>>
>>> abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566 is the first bad commit
>>>
>>>
>>> $ git bisect bad
>>> abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566 is the first bad commit
>>> commit abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566 (HEAD)
>>> Author: Josef Bacik <josef(a)toxicpanda.com>
>>> Date: Fri Jan 26 10:39:47 2024 -0500
>>>
>>> nfsd: make all of the nfsd stats per-network namespace
>>>
>>> [ Upstream commit 4b14885411f74b2b0ce0eb2b39d0fffe54e5ca0d ]
>>>
>>> We have a global set of counters that we modify for all of the nfsd
>>> operations, but now that we're exposing these stats across all network
>>> namespaces we need to make the stats also be per-network namespace. We
>>> already have some caching stats that are per-network namespace, so move
>>> these definitions into the same counter and then adjust all the helpers
>>> and users of these stats to provide the appropriate nfsd_net struct so
>>> that the stats are maintained for the per-network namespace objects.
>>>
>>> Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
>>> Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
>>> Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
>>> Stable-dep-of: 18180a4550d0 ("NFSD: Fix nfsd4_encode_fattr4() crasher")
>>> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>>>
>>> fs/nfsd/cache.h | 2 --
>>> fs/nfsd/netns.h | 17 +++++++++++++++--
>>> fs/nfsd/nfs4proc.c | 6 +++---
>>> fs/nfsd/nfs4state.c | 3 ++-
>>> fs/nfsd/nfscache.c | 36 +++++++-----------------------------
>>> fs/nfsd/nfsctl.c | 12 +++---------
>>> fs/nfsd/nfsfh.c | 3 ++-
>>> fs/nfsd/stats.c | 26 ++++++++++++++------------
>>> fs/nfsd/stats.h | 54 +++++++++++++++++++-----------------------------------
>>> fs/nfsd/vfs.c | 6 ++++--
>>> 10 files changed, 69 insertions(+), 96 deletions(-)
>>>
>>> $ git bisect log
>>> git bisect start
>>> # status: waiting for both good and bad commits
>>> # good: [f3d61438b613b87afb63118bea6fb18c50ba7a6b] Linux 6.8.9
>>> git bisect good f3d61438b613b87afb63118bea6fb18c50ba7a6b
>>> # status: waiting for bad commit, 1 good commit known
>>> # bad: [a0c69a570e420e86c7569b8c052913213eef2b45] Linux 6.8.10
>>> git bisect bad a0c69a570e420e86c7569b8c052913213eef2b45
>>> # bad: [4aaed9dbe8acd2b6114458f0498a617283d6275b] hv_netvsc: Don't
>>> free decrypted memory
>>> git bisect bad 4aaed9dbe8acd2b6114458f0498a617283d6275b
>>> # bad: [ee190d04c2f99c8e557b00e997621c04592baed1] net: gro: add flush
>>> check in udp_gro_receive_segment
>>> git bisect bad ee190d04c2f99c8e557b00e997621c04592baed1
>>> # bad: [781e34b736014188ba9e46a71535237313dcda81] efi/unaccepted:
>>> touch soft lockup during memory accept
>>> git bisect bad 781e34b736014188ba9e46a71535237313dcda81
>>> # bad: [6a7b07689af6e4e023404bf69b1230f43b2a15bc] NFSD: Fix
>>> nfsd4_encode_fattr4() crasher
>>> git bisect bad 6a7b07689af6e4e023404bf69b1230f43b2a15bc
>>> # good: [e05194baae299f2148ab5f6bab659c6ce8d1f6d3] nfs: expose
>>> /proc/net/sunrpc/nfs in net namespaces
>>> git bisect good e05194baae299f2148ab5f6bab659c6ce8d1f6d3
>>> # good: [946ab150335d92f852288c1c6b0f0466b5d6e97f] power: supply:
>>> mt6360_charger: Fix of_match for usb-otg-vbus regulator
>>> git bisect good 946ab150335d92f852288c1c6b0f0466b5d6e97f
>>> # good: [b7b05f98f3f06fea3986b46e5c7fe2928676b02d] nfsd: expose
>>> /proc/net/sunrpc/nfsd in net namespaces
>>> git bisect good b7b05f98f3f06fea3986b46e5c7fe2928676b02d
>>> # bad: [0e8003af77879572dbc1df56860cbe2bfa8498f0] NFSD: add support
>>> for CB_GETATTR callback
>>> git bisect bad 0e8003af77879572dbc1df56860cbe2bfa8498f0
>>> # bad: [abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566] nfsd: make all of
>>> the nfsd stats per-network namespace
>>> git bisect bad abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566
>>> # first bad commit: [abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566] nfsd:
>>> make all of the nfsd stats per-network namespace
>>
>> I built a full 6.8.10 with reverted single commit
>> "abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566". The server does not get
>> stuck when calling "nfsstat".
>
> Good to know, but I don't think it's entirely safe to revert
> only that patch -- all three would have to come off.
>
> I can't seem to get nfsstat to trigger a problem on my
> server.
>
>
> --
> Chuck Lever
>
>
bcm4377_init_cfg() uses pci_{read,write}_config_dword(), which return
PCIBIOS_* codes. Those codes are passed up to bcm4377_probe(), which
returns them directly even though they are of the wrong type (a probe
function should return normal errnos).
Convert the PCIBIOS_* return codes into normal errnos using
pcibios_err_to_errno() before returning from bcm4377_init_cfg(). The
conversion is easiest to do by adding a label next to the return and
converting there once, rather than adding pcibios_err_to_errno() to
every single return statement.
Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
---
drivers/bluetooth/hci_bcm4377.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/bluetooth/hci_bcm4377.c b/drivers/bluetooth/hci_bcm4377.c
index 0c2f15235b4c..b00240109dc3 100644
--- a/drivers/bluetooth/hci_bcm4377.c
+++ b/drivers/bluetooth/hci_bcm4377.c
@@ -2134,44 +2134,46 @@ static int bcm4377_init_cfg(struct bcm4377_data *bcm4377)
BCM4377_PCIECFG_BAR0_WINDOW1,
bcm4377->hw->bar0_window1);
if (ret)
- return ret;
+ goto fail;
ret = pci_write_config_dword(bcm4377->pdev,
BCM4377_PCIECFG_BAR0_WINDOW2,
bcm4377->hw->bar0_window2);
if (ret)
- return ret;
+ goto fail;
ret = pci_write_config_dword(
bcm4377->pdev, BCM4377_PCIECFG_BAR0_CORE2_WINDOW1,
BCM4377_PCIECFG_BAR0_CORE2_WINDOW1_DEFAULT);
if (ret)
- return ret;
+ goto fail;
if (bcm4377->hw->has_bar0_core2_window2) {
ret = pci_write_config_dword(bcm4377->pdev,
BCM4377_PCIECFG_BAR0_CORE2_WINDOW2,
bcm4377->hw->bar0_core2_window2);
if (ret)
- return ret;
+ goto fail;
}
ret = pci_write_config_dword(bcm4377->pdev, BCM4377_PCIECFG_BAR2_WINDOW,
BCM4377_PCIECFG_BAR2_WINDOW_DEFAULT);
if (ret)
- return ret;
+ goto fail;
ret = pci_read_config_dword(bcm4377->pdev,
BCM4377_PCIECFG_SUBSYSTEM_CTRL, &ctrl);
if (ret)
- return ret;
+ goto fail;
if (bcm4377->hw->clear_pciecfg_subsystem_ctrl_bit19)
ctrl &= ~BIT(19);
ctrl |= BIT(16);
- return pci_write_config_dword(bcm4377->pdev,
- BCM4377_PCIECFG_SUBSYSTEM_CTRL, ctrl);
+ ret = pci_write_config_dword(bcm4377->pdev,
+ BCM4377_PCIECFG_SUBSYSTEM_CTRL, ctrl);
+fail:
+ return pcibios_err_to_errno(ret);
}
static int bcm4377_probe_dmi(struct bcm4377_data *bcm4377)
--
2.39.2
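The single-exit conversion used in the bcm4377 patch above can be sketched
in userspace as follows. This is an illustration under assumptions, not the
driver code: fake_cfg_write(), fail_at, sketch_err_to_errno() and
init_cfg_sketch() are hypothetical names, the converter handles only one
PCIBIOS code explicitly, and the PCIBIOS_* values are reproduced from
memory of include/linux/pci.h.

```c
#include <assert.h>
#include <errno.h>

/* Assumed to match include/linux/pci.h; verify before relying on them. */
#define PCIBIOS_SUCCESSFUL	0x00
#define PCIBIOS_SET_FAILED	0x88

/* Hypothetical fault-injection knob: which step fails, -1 for none. */
int fail_at = -1;

/* Hypothetical stand-in for pci_write_config_dword(). */
static int fake_cfg_write(int step)
{
	return step == fail_at ? PCIBIOS_SET_FAILED : PCIBIOS_SUCCESSFUL;
}

/* Deliberately simplified converter; not the kernel helper. */
static int sketch_err_to_errno(int err)
{
	if (err <= PCIBIOS_SUCCESSFUL)
		return err;	/* zero, or already a negative errno */
	return err == PCIBIOS_SET_FAILED ? -EIO : -ERANGE;
}

int init_cfg_sketch(void)
{
	int ret;

	ret = fake_cfg_write(0);
	if (ret)
		goto fail;

	ret = fake_cfg_write(1);
	if (ret)
		goto fail;

	ret = fake_cfg_write(2);
fail:
	/* The success path also lands here; 0 converts to 0. */
	return sketch_err_to_errno(ret);
}
```

Because success and failure share one return statement, a step added later
cannot leak a raw PCIBIOS_* value as long as it jumps to the label.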
From: Nathan Lynch <nathanl(a)linux.ibm.com>
[ Upstream commit ff2e185cf73df480ec69675936c4ee75a445c3e4 ]
plpar_hcall(), plpar_hcall9(), and related functions expect callers to
provide valid result buffers of certain minimum size. Currently this
is communicated only through comments in the code and the compiler has
no idea.
For example, if I write a bug like this:
long retbuf[PLPAR_HCALL_BUFSIZE]; // should be PLPAR_HCALL9_BUFSIZE
plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf, ...);
This compiles with no diagnostics emitted, but likely results in stack
corruption at runtime when plpar_hcall9() stores results past the end
of the array. (To be clear this is a contrived example and I have not
found a real instance yet.)
To make this class of error less likely, we can use explicitly-sized
array parameters instead of pointers in the declarations for the hcall
APIs. When compiled with -Warray-bounds[1], the code above now
provokes a diagnostic like this:
error: array argument is too small;
is of size 32, callee requires at least 72 [-Werror,-Warray-bounds]
60 | plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf,
| ^ ~~~~~~
[1] Enabled for LLVM builds but not GCC for now. See commit
0da6e5fd6c37 ("gcc: disable '-Warray-bounds' for gcc-13 too") and
related changes.
Signed-off-by: Nathan Lynch <nathanl(a)linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://msgid.link/20240408-pseries-hvcall-retbuf-v1-1-ebc73d7253cf@linux.i…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/powerpc/include/asm/hvcall.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index a0b17f9f1ea4e..8347f57e1c6a3 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -383,7 +383,7 @@ long plpar_hcall_norets(unsigned long opcode, ...);
* Used for all but the craziest of phyp interfaces (see plpar_hcall9)
*/
#define PLPAR_HCALL_BUFSIZE 4
-long plpar_hcall(unsigned long opcode, unsigned long *retbuf, ...);
+long plpar_hcall(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL_BUFSIZE], ...);
/**
* plpar_hcall_raw: - Make a hypervisor call without calculating hcall stats
@@ -397,7 +397,7 @@ long plpar_hcall(unsigned long opcode, unsigned long *retbuf, ...);
* plpar_hcall, but plpar_hcall_raw works in real mode and does not
* calculate hypervisor call statistics.
*/
-long plpar_hcall_raw(unsigned long opcode, unsigned long *retbuf, ...);
+long plpar_hcall_raw(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL_BUFSIZE], ...);
/**
* plpar_hcall9: - Make a pseries hypervisor call with up to 9 return arguments
@@ -408,8 +408,8 @@ long plpar_hcall_raw(unsigned long opcode, unsigned long *retbuf, ...);
* PLPAR_HCALL9_BUFSIZE to size the return argument buffer.
*/
#define PLPAR_HCALL9_BUFSIZE 9
-long plpar_hcall9(unsigned long opcode, unsigned long *retbuf, ...);
-long plpar_hcall9_raw(unsigned long opcode, unsigned long *retbuf, ...);
+long plpar_hcall9(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL9_BUFSIZE], ...);
+long plpar_hcall9_raw(unsigned long opcode, unsigned long retbuf[static PLPAR_HCALL9_BUFSIZE], ...);
struct hvcall_mpp_data {
unsigned long entitled_mem;
--
2.43.0
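The explicitly-sized array parameter used in the hvcall.h change above is
plain C99 syntax and can be tried outside the kernel. Below is a small
sketch with hypothetical names (SMALL_BUFSIZE, LARGE_BUFSIZE, fill_large(),
sum_large()). Whether an undersized caller buffer is diagnosed depends on
the compiler and flags, as the commit message notes.

```c
#include <assert.h>

#define SMALL_BUFSIZE 4
#define LARGE_BUFSIZE 9

/*
 * "static LARGE_BUFSIZE" tells the compiler that callers must pass a
 * pointer to at least LARGE_BUFSIZE elements.
 */
void fill_large(unsigned long retbuf[static LARGE_BUFSIZE])
{
	for (int i = 0; i < LARGE_BUFSIZE; i++)
		retbuf[i] = (unsigned long)i + 1;
}

unsigned long sum_large(void)
{
	/*
	 * Correctly sized. Declaring this with SMALL_BUFSIZE instead is
	 * the class of bug the annotation is meant to expose.
	 */
	unsigned long retbuf[LARGE_BUFSIZE];
	unsigned long sum = 0;

	fill_large(retbuf);
	for (int i = 0; i < LARGE_BUFSIZE; i++)
		sum += retbuf[i];
	return sum;
}
```

The parameter still decays to a pointer, so the calling convention and
generated code stay the same; only the information available to the
compiler changes.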
kthread creation may fail inside race_signal_callback(). In that case,
stop the threads that were already started, drop the references already
taken on them, and return an error code.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 2989f6451084 ("dma-buf: Add selftests for dma-fence")
Cc: stable(a)vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin(a)ispras.ru>
---
v2: use kthread_stop_put() to actually put the last reference as
T.J. Mercier noticed;
link to v1: https://lore.kernel.org/lkml/20240522122326.696928-1-pchelkin@ispras.ru/
drivers/dma-buf/st-dma-fence.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/dma-buf/st-dma-fence.c b/drivers/dma-buf/st-dma-fence.c
index b7c6f7ea9e0c..6a1bfcd0cc21 100644
--- a/drivers/dma-buf/st-dma-fence.c
+++ b/drivers/dma-buf/st-dma-fence.c
@@ -540,6 +540,12 @@ static int race_signal_callback(void *arg)
t[i].before = pass;
t[i].task = kthread_run(thread_signal_callback, &t[i],
"dma-fence:%d", i);
+ if (IS_ERR(t[i].task)) {
+ ret = PTR_ERR(t[i].task);
+ while (--i >= 0)
+ kthread_stop_put(t[i].task);
+ return ret;
+ }
get_task_struct(t[i].task);
}
--
2.39.2
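The unwind loop added to race_signal_callback() above follows a common
pattern: on a failure at index i, walk back over entries 0..i-1 only. Below
is a userspace sketch of that pattern. The names worker_start(),
worker_stop(), fail_index, stopped_count and start_all() are hypothetical
stand-ins for kthread_run(), kthread_stop_put() and the test's loop; no
real threads are involved.

```c
#include <assert.h>
#include <stddef.h>

#define NWORKERS 4

struct worker {
	int started;
};

/* Hypothetical fault-injection knob and bookkeeping for the sketch. */
int fail_index = -1;
int stopped_count;
struct worker pool[NWORKERS];

/* Stand-in for kthread_run(); returns NULL on failure. */
static struct worker *worker_start(int i)
{
	if (i == fail_index)
		return NULL;
	pool[i].started = 1;
	return &pool[i];
}

/* Stand-in for kthread_stop_put(): stop and release one worker. */
static void worker_stop(struct worker *w)
{
	w->started = 0;
	stopped_count++;
}

int start_all(struct worker *t[NWORKERS])
{
	for (int i = 0; i < NWORKERS; i++) {
		t[i] = worker_start(i);
		if (!t[i]) {
			/* Entry i never started; unwind 0..i-1 only. */
			while (--i >= 0)
				worker_stop(t[i]);
			return -1;
		}
	}
	return 0;
}
```

The pre-decrement in "while (--i >= 0)" skips the failed slot, which
matters in the patch because no reference was taken for that entry.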
From: Rand Deeb <rand.sec96(a)gmail.com>
[ Upstream commit 789c17185fb0f39560496c2beab9b57ce1d0cbe7 ]
The ssb_device_uevent() function first attempts to convert the 'dev' pointer
to 'struct ssb_device *'. However, it mistakenly dereferences 'dev' before
performing the NULL check, potentially leading to a NULL pointer
dereference if 'dev' is NULL.
To fix this issue, move the NULL check before dereferencing the 'dev' pointer,
ensuring that the pointer is valid before attempting to use it.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Rand Deeb <rand.sec96(a)gmail.com>
Signed-off-by: Kalle Valo <kvalo(a)kernel.org>
Link: https://msgid.link/20240306123028.164155-1-rand.sec96@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/ssb/main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/ssb/main.c b/drivers/ssb/main.c
index b9934b9c2d708..070a99a4180cc 100644
--- a/drivers/ssb/main.c
+++ b/drivers/ssb/main.c
@@ -341,11 +341,13 @@ static int ssb_bus_match(struct device *dev, struct device_driver *drv)
static int ssb_device_uevent(const struct device *dev, struct kobj_uevent_env *env)
{
- const struct ssb_device *ssb_dev = dev_to_ssb_dev(dev);
+ const struct ssb_device *ssb_dev;
if (!dev)
return -ENODEV;
+ ssb_dev = dev_to_ssb_dev(dev);
+
return add_uevent_var(env,
"MODALIAS=ssb:v%04Xid%04Xrev%02X",
ssb_dev->id.vendor, ssb_dev->id.coreid,
--
2.43.0
From: Rand Deeb <rand.sec96(a)gmail.com>
[ Upstream commit 789c17185fb0f39560496c2beab9b57ce1d0cbe7 ]
The ssb_device_uevent() function first attempts to convert the 'dev' pointer
to 'struct ssb_device *'. However, it mistakenly dereferences 'dev' before
performing the NULL check, potentially leading to a NULL pointer
dereference if 'dev' is NULL.
To fix this issue, move the NULL check before dereferencing the 'dev' pointer,
ensuring that the pointer is valid before attempting to use it.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Rand Deeb <rand.sec96(a)gmail.com>
Signed-off-by: Kalle Valo <kvalo(a)kernel.org>
Link: https://msgid.link/20240306123028.164155-1-rand.sec96@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/ssb/main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/ssb/main.c b/drivers/ssb/main.c
index ab080cf26c9ff..0c736d51566dc 100644
--- a/drivers/ssb/main.c
+++ b/drivers/ssb/main.c
@@ -341,11 +341,13 @@ static int ssb_bus_match(struct device *dev, struct device_driver *drv)
static int ssb_device_uevent(const struct device *dev, struct kobj_uevent_env *env)
{
- const struct ssb_device *ssb_dev = dev_to_ssb_dev(dev);
+ const struct ssb_device *ssb_dev;
if (!dev)
return -ENODEV;
+ ssb_dev = dev_to_ssb_dev(dev);
+
return add_uevent_var(env,
"MODALIAS=ssb:v%04Xid%04Xrev%02X",
ssb_dev->id.vendor, ssb_dev->id.coreid,
--
2.43.0
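The reordering in ssb_device_uevent() above matters because the conversion
helper reads memory through the pointer it is given. Below is a userspace
sketch of the same ordering. The types and names are hypothetical
(device_s, ssb_like, wrapper, container_of_s(), dev_to_inner(),
vendor_of()), and the wrapper layout is an assumption made for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct device_s {
	int unused;
};

struct ssb_like {
	unsigned int vendor;
};

/* Hypothetical wrapper that embeds the generic device. */
struct wrapper {
	struct device_s dev;
	struct ssb_like *sdev;
};

#define container_of_s(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Loads wrapper->sdev, so it must never be called with NULL. */
static struct ssb_like *dev_to_inner(struct device_s *dev)
{
	return container_of_s(dev, struct wrapper, dev)->sdev;
}

int vendor_of(struct device_s *dev, unsigned int *vendor)
{
	struct ssb_like *inner;

	/* Check first, convert second: the order the patch establishes. */
	if (!dev)
		return -ENODEV;

	inner = dev_to_inner(dev);
	*vendor = inner->vendor;
	return 0;
}
```

Initialising the local at its declaration, as the old code did, would run
the conversion before the NULL check could take effect.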
From: "Alessandro Carminati (Red Hat)" <alessandro.carminati(a)gmail.com>
[ Upstream commit f803bcf9208a2540acb4c32bdc3616673169f490 ]
On some systems, the netcat server can be slow to start listening.
When this happens, the test can fail randomly at various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gm…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 334bdfeab9403..365a2c7a89bad 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -72,7 +72,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}
client_connect() {
@@ -93,6 +92,16 @@ verify_data() {
fi
}
+wait_for_port() {
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" ss ${2:--4}OHntl | grep -q "$1"; then
+ return 0
+ fi
+ sleep 0.1
+ done
+ return 1
+}
+
set -e
# no arguments: automated test, run all
@@ -190,6 +199,7 @@ setup
# basic communication works
echo "test basic connectivity"
server_listen
+wait_for_port ${port} ${netcat_opt}
client_connect
verify_data
@@ -201,6 +211,7 @@ ip netns exec "${ns1}" tc filter add dev veth1 egress \
section "encap_${tuntype}_${mac}"
echo "test bpf encap without decap (expect failure)"
server_listen
+wait_for_port ${port} ${netcat_opt}
! client_connect
if [[ "$tuntype" =~ "udp" ]]; then
--
2.43.0
From: "Alessandro Carminati (Red Hat)" <alessandro.carminati(a)gmail.com>
[ Upstream commit f803bcf9208a2540acb4c32bdc3616673169f490 ]
On some systems, the netcat server can be slow to start listening.
When this happens, the test can fail randomly at various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gm…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 088fcad138c98..38c6e9f16f41e 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -71,7 +71,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}
client_connect() {
@@ -92,6 +91,16 @@ verify_data() {
fi
}
+wait_for_port() {
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" ss ${2:--4}OHntl | grep -q "$1"; then
+ return 0
+ fi
+ sleep 0.1
+ done
+ return 1
+}
+
set -e
# no arguments: automated test, run all
@@ -189,6 +198,7 @@ setup
# basic communication works
echo "test basic connectivity"
server_listen
+wait_for_port ${port} ${netcat_opt}
client_connect
verify_data
@@ -200,6 +210,7 @@ ip netns exec "${ns1}" tc filter add dev veth1 egress \
section "encap_${tuntype}_${mac}"
echo "test bpf encap without decap (expect failure)"
server_listen
+wait_for_port ${port} ${netcat_opt}
! client_connect
if [[ "$tuntype" =~ "udp" ]]; then
--
2.43.0
From: "Alessandro Carminati (Red Hat)" <alessandro.carminati(a)gmail.com>
[ Upstream commit f803bcf9208a2540acb4c32bdc3616673169f490 ]
On some systems, the netcat server can be slow to start listening.
When this happens, the test can fail randomly at various points.
This is an example error message:
# ip gre none gso
# encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000
# test basic connectivity
# Ncat: Connection refused.
The issue stems from a race condition between the netcat client and server.
The test author had addressed this problem by implementing a sleep, which
I have removed in this patch.
This patch introduces a function capable of sleeping for up to two seconds.
However, it can terminate the waiting period early if the port is reported
to be listening.
Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati(a)gmail.com>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gm…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
tools/testing/selftests/bpf/test_tc_tunnel.sh | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 7c76b841b17bb..21bde60c95230 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -71,7 +71,6 @@ cleanup() {
server_listen() {
ip netns exec "${ns2}" nc "${netcat_opt}" -l -p "${port}" > "${outfile}" &
server_pid=$!
- sleep 0.2
}
client_connect() {
@@ -92,6 +91,16 @@ verify_data() {
fi
}
+wait_for_port() {
+ for i in $(seq 20); do
+ if ip netns exec "${ns2}" ss ${2:--4}OHntl | grep -q "$1"; then
+ return 0
+ fi
+ sleep 0.1
+ done
+ return 1
+}
+
set -e
# no arguments: automated test, run all
@@ -183,6 +192,7 @@ setup
# basic communication works
echo "test basic connectivity"
server_listen
+wait_for_port ${port} ${netcat_opt}
client_connect
verify_data
@@ -194,6 +204,7 @@ ip netns exec "${ns1}" tc filter add dev veth1 egress \
section "encap_${tuntype}_${mac}"
echo "test bpf encap without decap (expect failure)"
server_listen
+wait_for_port ${port} ${netcat_opt}
! client_connect
if [[ "$tuntype" =~ "udp" ]]; then
--
2.43.0
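The wait_for_port() helper above is a bounded poll: check, sleep briefly,
give up after a fixed number of attempts. Below is a C rendering of the
same idea, written only to illustrate the control flow. wait_until() and
countdown_ready() are hypothetical names, and the predicate is a counter
rather than a socket check.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Poll 'ready' up to 'attempts' times, sleeping 'interval_ms' after each
 * failed check. Returns 0 as soon as the predicate holds and 1 on
 * timeout, matching the exit status convention of the shell function.
 */
int wait_until(bool (*ready)(void *), void *arg, int attempts,
	       long interval_ms)
{
	struct timespec delay = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};

	for (int i = 0; i < attempts; i++) {
		if (ready(arg))
			return 0;
		nanosleep(&delay, NULL);
	}
	return 1;
}

/* Hypothetical predicate: true once the counter has reached zero. */
bool countdown_ready(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With 20 attempts and a 100 ms interval this gives the same two-second upper
bound as the shell loop, while returning early in the common case.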
intel_th_pci_activate() uses pci_{read,write}_config_dword(), which
return PCIBIOS_* codes. The value is returned as is to the caller, and
this non-errno value propagates all the way to active_store(), which
returns it as if it were an error. PCIBIOS_* return codes, however, are
positive (0x8X), so the return value of the store function is treated
as the length consumed from the write buffer, which can confuse the
userspace writer.
Convert the PCIBIOS_* return codes into normal errnos using
pcibios_err_to_errno() before returning from intel_th_pci_activate().
Fixes: a0e7df335afd ("intel_th: Perform time resync on capture start")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
---
drivers/hwtracing/intel_th/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwtracing/intel_th/pci.c b/drivers/hwtracing/intel_th/pci.c
index 147d338c191e..40e2c922fbe7 100644
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -46,7 +46,7 @@ static int intel_th_pci_activate(struct intel_th *th)
if (err)
dev_err(&pdev->dev, "failed to read NPKDSC register\n");
- return err;
+ return pcibios_err_to_errno(err);
}
static void intel_th_pci_deactivate(struct intel_th *th)
--
2.39.2
[Why]
After suspend/resume with the topology unchanged, link_address_sent of
all mstbs is marked as false even though topology probing completed
without any error.
This is caused by wrongly treating the "ret == 0" case as a probing
failure.
[How]
Stop treating "ret == 0" as a failure in the affected checks.
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Harry Wentland <hwentlan(a)amd.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: stable(a)vger.kernel.org
Fixes: 37dfdc55ffeb ("drm/dp_mst: Cleanup drm_dp_send_link_address() a bit")
Signed-off-by: Wayne Lin <Wayne.Lin(a)amd.com>
---
drivers/gpu/drm/display/drm_dp_mst_topology.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c b/drivers/gpu/drm/display/drm_dp_mst_topology.c
index 7f8e1cfbe19d..68831f4e502a 100644
--- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
@@ -2929,7 +2929,7 @@ static int drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr,
/* FIXME: Actually do some real error handling here */
ret = drm_dp_mst_wait_tx_reply(mstb, txmsg);
- if (ret <= 0) {
+ if (ret < 0) {
drm_err(mgr->dev, "Sending link address failed with %d\n", ret);
goto out;
}
@@ -2981,7 +2981,7 @@ static int drm_dp_send_link_address(struct drm_dp_mst_topology_mgr *mgr,
mutex_unlock(&mgr->lock);
out:
- if (ret <= 0)
+ if (ret < 0)
mstb->link_address_sent = false;
kfree(txmsg);
return ret < 0 ? ret : changed;
--
2.37.3
Until recently the "upper layer" was MTD. But following incremental
reworks to bring spi-nand support and more recently generic ECC support,
there is now an intermediate "generic NAND" layer that also needs to get
access to some values. When using "converted" ECC engines, like the
software ones, these values are already propagated correctly. But
otherwise when using good old raw NAND controller drivers, we need to
manually set these values ourselves at the end of the "scan" operation,
once these values have been negotiated.
Without this propagation, later (generic) checks like the one warning
users that the ECC strength is not high enough might simply no longer
work.
Fixes: 8c126720fe10 ("mtd: rawnand: Use the ECC framework nand_ecc_is_strong_enough() helper")
Cc: stable(a)vger.kernel.org
Reported-by: Sascha Hauer <s.hauer(a)pengutronix.de>
Closes: https://lore.kernel.org/all/Zhe2JtvvN1M4Ompw@pengutronix.de/
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
Hello Sascha, this is only compile tested, would you mind checking if
that fixes your setup?
Thanks, Miquèl
drivers/mtd/nand/raw/nand_base.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index d7dbbd469b89..acd137dd0957 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -6301,6 +6301,7 @@ static const struct nand_ops rawnand_ops = {
static int nand_scan_tail(struct nand_chip *chip)
{
struct mtd_info *mtd = nand_to_mtd(chip);
+ struct nand_device *base = &chip->base;
struct nand_ecc_ctrl *ecc = &chip->ecc;
int ret, i;
@@ -6445,9 +6446,13 @@ static int nand_scan_tail(struct nand_chip *chip)
if (!ecc->write_oob_raw)
ecc->write_oob_raw = ecc->write_oob;
- /* propagate ecc info to mtd_info */
+ /* Propagate ECC info to the generic NAND and MTD layers */
mtd->ecc_strength = ecc->strength;
+ if (!base->ecc.ctx.conf.strength)
+ base->ecc.ctx.conf.strength = ecc->strength;
mtd->ecc_step_size = ecc->size;
+ if (!base->ecc.ctx.conf.step_size)
+ base->ecc.ctx.conf.step_size = ecc->size;
/*
* Set the number of read / write steps for one page depending on ECC
@@ -6455,6 +6460,8 @@ static int nand_scan_tail(struct nand_chip *chip)
*/
if (!ecc->steps)
ecc->steps = mtd->writesize / ecc->size;
+ if (!base->ecc.ctx.nsteps)
+ base->ecc.ctx.nsteps = ecc->steps;
if (ecc->steps * ecc->size != mtd->writesize) {
WARN(1, "Invalid ECC parameters\n");
ret = -EINVAL;
--
2.40.1
The nand_read_data_op() operation, which consists only of DATA_IN
cycles, is sadly not supported by all controllers despite being very
basic. For some time, the core assumed that all drivers would support
it. To support more constrained controllers, a later improvement added
a check verifying that the operation is supported before attempting it,
by first running the function with the check_only boolean set, and then
possibly falling back to another (possibly slightly less optimized)
alternative.
An even newer addition moved that check very early, to probe time, in
order to perform the check only once. The content of the operation was
not so important, as long as the controller driver would tell whether
such an operation on the NAND bus would be possible or not. In practice,
no buffer was provided (no fake buffer or whatever) as it is anyway not
relevant for the "check_only" condition. Unfortunately, early in the
function, there is an if statement verifying that the input parameters
are right for normal use, making the early check always unsuccessful.
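As a rough userspace illustration of the relaxed argument check (not the
kernel function itself), the sketch below uses a hypothetical
sketch_read_data() helper that pretends the controller supports plain
DATA_IN cycles. The only point it makes is that a NULL buffer is
acceptable when check_only is set.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a data-only read on a capable controller. */
static int sketch_read_data(void *buf, unsigned int len, bool check_only)
{
	/* A buffer is only required when data will really be transferred. */
	if (!len || (!check_only && !buf))
		return -EINVAL;

	if (check_only)
		return 0;	/* the pattern is supported; nothing is read */

	memset(buf, 0xff, len);	/* pretend the bus returned erased-flash data */
	return 0;
}
```

With the previous condition (!len || !buf), the probe-time call
sketch_read_data(NULL, 1, true) would have returned -EINVAL, which is
the "always unsuccessful" early check described above.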
Fixes: 9f820fc0651c ("mtd: rawnand: Check the data only read pattern only once")
Cc: stable(a)vger.kernel.org
Reported-by: Alexander Dahl <ada(a)thorsis.com>
Closes: https://lore.kernel.org/linux-mtd/20240306-shaky-bunion-d28b65ea97d7@thorsi…
Reported-by: Steven Seeger <steven.seeger(a)flightsystems.net>
Closes: https://lore.kernel.org/linux-mtd/DM6PR05MB4506554457CF95191A670BDEF7062@DM…
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
Reviewed-by: Alexander Dahl <ada(a)thorsis.com>
---
drivers/mtd/nand/raw/nand_base.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index acd137dd0957..248e654ecefd 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -2173,7 +2173,7 @@ EXPORT_SYMBOL_GPL(nand_reset_op);
int nand_read_data_op(struct nand_chip *chip, void *buf, unsigned int len,
bool force_8bit, bool check_only)
{
- if (!len || !buf)
+ if (!len || (!check_only && !buf))
return -EINVAL;
if (nand_has_exec_op(chip)) {
--
2.40.1
Early during NAND identification, mtd_info fields have not yet been
initialized (namely, writesize and oobsize) and thus cannot be used for
sanity checks yet. Of course if there is a misuse of
nand_change_read_column_op() so early we won't be warned, but there is
anyway no actual check to perform at this stage as we do not yet know
the NAND geometry.
So, if the fields are empty, especially mtd->writesize which is *always*
set quite rapidly after identification, let's skip the sanity checks.
nand_change_read_column_op() may be used early for ONFI/JEDEC
identification in the very unlikely case of:
- bitflips appearing in the parameter page,
- the controller driver not supporting simple DATA_IN cycles.
As nand_change_read_column_op() uses nand_fill_column_cycles(), the
logic explained above also applies to this secondary helper.
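To make the identification-stage bypass concrete, here is a simplified
userspace sketch of the column-cycle helper. The struct and function
names are hypothetical, and the 16-bit bus handling of the real helper
is deliberately left out; a writesize of 0 is assumed to mean "geometry
not known yet".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_geometry {
	unsigned int writesize;	/* 0 until identification has completed */
	unsigned int oobsize;
};

/* Returns the number of column address cycles written, or a negative errno. */
static int sketch_fill_column_cycles(const struct sketch_geometry *g,
				     uint8_t *addrs, unsigned int offset_in_page)
{
	bool ident_stage = !g->writesize;

	/* Geometry-based checks only make sense once the geometry is known. */
	if (!ident_stage) {
		if (offset_in_page > g->writesize + g->oobsize)
			return -EINVAL;

		/* Small-page devices address the OOB area relative to its start. */
		if (g->writesize <= 512 && offset_in_page >= g->writesize)
			offset_in_page -= g->writesize;
	}

	addrs[0] = offset_in_page & 0xff;

	/* Small-page devices use a single column cycle. */
	if (!ident_stage && g->writesize <= 512)
		return 1;

	addrs[1] = (offset_in_page >> 8) & 0xff;
	return 2;
}
```

Without the ident_stage test, a call made before identification
(writesize and oobsize both 0) with any non-zero offset would fail the
range check and return -EINVAL.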
Fixes: c27842e7e11f ("mtd: rawnand: onfi: Adapt the parameter page read to constraint controllers")
Fixes: daca31765e8b ("mtd: rawnand: jedec: Adapt the parameter page read to constraint controllers")
Cc: stable(a)vger.kernel.org
Reported-by: Alexander Dahl <ada(a)thorsis.com>
Closes: https://lore.kernel.org/linux-mtd/20240306-shaky-bunion-d28b65ea97d7@thorsi…
Reported-by: Steven Seeger <steven.seeger(a)flightsystems.net>
Closes: https://lore.kernel.org/linux-mtd/DM6PR05MB4506554457CF95191A670BDEF7062@DM…
Signed-off-by: Miquel Raynal <miquel.raynal(a)bootlin.com>
---
Changes in v2:
* Dropped the double (( ))
* Fixed nand_fill_column_cycles() as well.
---
drivers/mtd/nand/raw/nand_base.c | 57 ++++++++++++++++++--------------
1 file changed, 32 insertions(+), 25 deletions(-)
diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
index 248e654ecefd..53e16d39af4b 100644
--- a/drivers/mtd/nand/raw/nand_base.c
+++ b/drivers/mtd/nand/raw/nand_base.c
@@ -1093,28 +1093,32 @@ static int nand_fill_column_cycles(struct nand_chip *chip, u8 *addrs,
unsigned int offset_in_page)
{
struct mtd_info *mtd = nand_to_mtd(chip);
+ bool ident_stage = !mtd->writesize;
- /* Make sure the offset is less than the actual page size. */
- if (offset_in_page > mtd->writesize + mtd->oobsize)
- return -EINVAL;
-
- /*
- * On small page NANDs, there's a dedicated command to access the OOB
- * area, and the column address is relative to the start of the OOB
- * area, not the start of the page. Asjust the address accordingly.
- */
- if (mtd->writesize <= 512 && offset_in_page >= mtd->writesize)
- offset_in_page -= mtd->writesize;
-
- /*
- * The offset in page is expressed in bytes, if the NAND bus is 16-bit
- * wide, then it must be divided by 2.
- */
- if (chip->options & NAND_BUSWIDTH_16) {
- if (WARN_ON(offset_in_page % 2))
+ /* Bypass all checks during NAND identification */
+ if (likely(!ident_stage)) {
+ /* Make sure the offset is less than the actual page size. */
+ if (offset_in_page > mtd->writesize + mtd->oobsize)
return -EINVAL;
- offset_in_page /= 2;
+ /*
+ * On small page NANDs, there's a dedicated command to access the OOB
+ * area, and the column address is relative to the start of the OOB
+ * area, not the start of the page. Asjust the address accordingly.
+ */
+ if (mtd->writesize <= 512 && offset_in_page >= mtd->writesize)
+ offset_in_page -= mtd->writesize;
+
+ /*
+ * The offset in page is expressed in bytes, if the NAND bus is 16-bit
+ * wide, then it must be divided by 2.
+ */
+ if (chip->options & NAND_BUSWIDTH_16) {
+ if (WARN_ON(offset_in_page % 2))
+ return -EINVAL;
+
+ offset_in_page /= 2;
+ }
}
addrs[0] = offset_in_page;
@@ -1123,7 +1127,7 @@ static int nand_fill_column_cycles(struct nand_chip *chip, u8 *addrs,
* Small page NANDs use 1 cycle for the columns, while large page NANDs
* need 2
*/
- if (mtd->writesize <= 512)
+ if (!ident_stage && mtd->writesize <= 512)
return 1;
addrs[1] = offset_in_page >> 8;
@@ -1436,16 +1440,19 @@ int nand_change_read_column_op(struct nand_chip *chip,
unsigned int len, bool force_8bit)
{
struct mtd_info *mtd = nand_to_mtd(chip);
+ bool ident_stage = !mtd->writesize;
if (len && !buf)
return -EINVAL;
- if (offset_in_page + len > mtd->writesize + mtd->oobsize)
- return -EINVAL;
+ if (!ident_stage) {
+ if (offset_in_page + len > mtd->writesize + mtd->oobsize)
+ return -EINVAL;
- /* Small page NANDs do not support column change. */
- if (mtd->writesize <= 512)
- return -ENOTSUPP;
+ /* Small page NANDs do not support column change. */
+ if (mtd->writesize <= 512)
+ return -ENOTSUPP;
+ }
if (nand_has_exec_op(chip)) {
const struct nand_interface_config *conf =
--
2.40.1
.setup_interface first gets called with a "target" value of
NAND_DATA_IFACE_CHECK_ONLY, in which case an error is expected
if the controller driver does not support the timing mode (NVDDR).
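The ordering issue can be sketched in plain C as follows. This is an
illustrative model only: sketch_setup_interface() and its is_sdr flag
are hypothetical, SKETCH_CHECK_ONLY stands in for
NAND_DATA_IFACE_CHECK_ONLY, and no hardware is touched.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_CHECK_ONLY (-1)	/* stands in for NAND_DATA_IFACE_CHECK_ONLY */

static int sketch_setup_interface(int target, bool is_sdr)
{
	/* Validate the timing mode first, so that a check-only call can fail. */
	if (!is_sdr)
		return -EOPNOTSUPP;

	/* Nothing to program when the core only asks whether the mode is supported. */
	if (target < 0)
		return 0;

	/* A real driver would compute and program its timing registers here. */
	return 0;
}
```

If the target < 0 test came first, a check-only call for an unsupported
mode would return 0, and the core would believe the mode is usable.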
Fixes: a9ecc8c814e9 ("mtd: rawnand: Choose the best timings, NV-DDR included")
Signed-off-by: Val Packett <val(a)packett.cool>
Cc: stable(a)vger.kernel.org
---
drivers/mtd/nand/raw/rockchip-nand-controller.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/mtd/nand/raw/rockchip-nand-controller.c b/drivers/mtd/nand/raw/rockchip-nand-controller.c
index 7baaef69d..555804476 100644
--- a/drivers/mtd/nand/raw/rockchip-nand-controller.c
+++ b/drivers/mtd/nand/raw/rockchip-nand-controller.c
@@ -420,13 +420,13 @@ static int rk_nfc_setup_interface(struct nand_chip *chip, int target,
u32 rate, tc2rw, trwpw, trw2c;
u32 temp;
- if (target < 0)
- return 0;
-
timings = nand_get_sdr_timings(conf);
if (IS_ERR(timings))
return -EOPNOTSUPP;
+ if (target < 0)
+ return 0;
+
if (IS_ERR(nfc->nfc_clk))
rate = clk_get_rate(nfc->ahb_clk);
else
--
2.45.0
The platform driver conversion of EINJ mistakenly used
platform_device_del() to unwind platform_device_register_full() at
module exit. This leads to a small leak of one 'struct platform_device'
instance per module load/unload cycle. Switch to
platform_device_unregister() which performs both device_del() and final
put_device().
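The leak can be pictured with a toy reference-count model. This is only
a sketch under simplifying assumptions: the sketch_* names are
hypothetical, a single counter replaces the driver core's real reference
handling, and "freed" is a flag instead of an actual release callback.

```c
#include <stdbool.h>

struct sketch_device {
	int refcount;
	bool registered;
	bool freed;
};

/* Drop one reference; the last put releases the object. */
static void sketch_put(struct sketch_device *d)
{
	if (--d->refcount == 0)
		d->freed = true;
}

/* Allocate-and-add in one step; the caller owns the initial reference. */
static void sketch_register(struct sketch_device *d)
{
	d->refcount = 1;
	d->registered = true;
	d->freed = false;
}

/* Models platform_device_del(): unhooks the device but keeps the reference. */
static void sketch_del(struct sketch_device *d)
{
	d->registered = false;
}

/* Models platform_device_unregister(): del followed by the final put. */
static void sketch_unregister(struct sketch_device *d)
{
	sketch_del(d);
	sketch_put(d);
}
```

After sketch_register() followed by sketch_del(), the object is no
longer registered but is never freed; pairing sketch_register() with
sketch_unregister() releases it.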
Fixes: 5621fafaac00 ("EINJ: Migrate to a platform driver")
Cc: <stable(a)vger.kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: Ben Cheatham <Benjamin.Cheatham(a)amd.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/acpi/apei/einj-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/einj-core.c b/drivers/acpi/apei/einj-core.c
index 01faca3a238a..bb9f8475ce59 100644
--- a/drivers/acpi/apei/einj-core.c
+++ b/drivers/acpi/apei/einj-core.c
@@ -903,7 +903,7 @@ static void __exit einj_exit(void)
if (einj_initialized)
platform_driver_unregister(&einj_driver);
- platform_device_del(einj_dev);
+ platform_device_unregister(einj_dev);
}
module_init(einj_init);
This is an automatically generated email to let you know that the following patch was queued:
Subject: media: mgb4: Fix double debugfs remove
Author: Martin Tůma <martin.tuma(a)digiteqautomotive.com>
Date: Tue May 21 18:22:54 2024 +0200
Fixes an error where debugfs_remove_recursive() is called first on a parent
directory and then again on a child which causes a kernel panic.
Signed-off-by: Martin Tůma <martin.tuma(a)digiteqautomotive.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Fixes: 0ab13674a9bd ("media: pci: mgb4: Added Digiteq Automotive MGB4 driver")
Cc: <stable(a)vger.kernel.org>
[hverkuil: added Fixes/Cc tags]
drivers/media/pci/mgb4/mgb4_core.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
---
diff --git a/drivers/media/pci/mgb4/mgb4_core.c b/drivers/media/pci/mgb4/mgb4_core.c
index 60498a5abebf..ab4f07e2e560 100644
--- a/drivers/media/pci/mgb4/mgb4_core.c
+++ b/drivers/media/pci/mgb4/mgb4_core.c
@@ -642,9 +642,6 @@ static void mgb4_remove(struct pci_dev *pdev)
struct mgb4_dev *mgbdev = pci_get_drvdata(pdev);
int i;
-#ifdef CONFIG_DEBUG_FS
- debugfs_remove_recursive(mgbdev->debugfs);
-#endif
#if IS_REACHABLE(CONFIG_HWMON)
hwmon_device_unregister(mgbdev->hwmon_dev);
#endif
@@ -659,6 +656,10 @@ static void mgb4_remove(struct pci_dev *pdev)
if (mgbdev->vin[i])
mgb4_vin_free(mgbdev->vin[i]);
+#ifdef CONFIG_DEBUG_FS
+ debugfs_remove_recursive(mgbdev->debugfs);
+#endif
+
device_remove_groups(&mgbdev->pdev->dev, mgb4_pci_groups);
free_spi(mgbdev);
free_i2c(mgbdev);
When multiple streams are in use, multiple TDs might be in flight when
an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for
each, to ensure everything is reset properly and the caches cleared.
Change the logic so that any N>1 TDs found active for different streams
are deferred until after the first one is processed, calling
xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to
queue another command until we are done with all of them. Also change
the error/"should never happen" paths to ensure we at least clear any
affected TDs, even if we can't issue a command to clear the hardware
cache, and complain loudly with an xhci_warn() if this ever happens.
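The deferral scheme described above can be modelled in userspace as a
small state machine. The sketch below is an assumption-laden
simplification, not the driver code: the sketch_* names and SK_* states
are hypothetical, TDs live in a plain array instead of a list, and
"issuing a command" is reduced to returning an index.

```c
#include <stddef.h>

enum sketch_status {
	SK_DIRTY,
	SK_CLEARING_CACHE,
	SK_CLEARING_CACHE_DEFERRED,
	SK_CLEARED,
};

struct sketch_td {
	unsigned int stream_id;
	enum sketch_status status;
};

/*
 * One invalidation pass: select one TD whose cache must be cleared and
 * defer TDs that belong to other streams. Returns the index of the TD a
 * Set TR Dequeue Pointer command would be queued for, or -1 if none.
 */
static int sketch_invalidate(struct sketch_td *tds, size_t n)
{
	int cached = -1;

	for (size_t i = 0; i < n; i++) {
		struct sketch_td *td = &tds[i];

		if (td->status != SK_DIRTY &&
		    td->status != SK_CLEARING_CACHE_DEFERRED)
			continue;

		if (cached >= 0 && tds[cached].stream_id != td->stream_id) {
			td->status = SK_CLEARING_CACHE_DEFERRED;
			continue;
		}

		/* Same stream twice ("should never happen"): clear the earlier TD. */
		if (cached >= 0)
			tds[cached].status = SK_CLEARED;

		td->status = SK_CLEARING_CACHE;
		cached = (int)i;
	}
	return cached;
}

/* Command completion: finish the in-flight TD, report whether any TD is deferred. */
static int sketch_handle_set_deq(struct sketch_td *tds, size_t n)
{
	int deferred = 0;

	for (size_t i = 0; i < n; i++) {
		if (tds[i].status == SK_CLEARING_CACHE)
			tds[i].status = SK_CLEARED;
		else if (tds[i].status == SK_CLEARING_CACHE_DEFERRED)
			deferred = 1;
	}
	return deferred;
}

/* Repeat passes until nothing is deferred; returns the number of commands issued. */
static int sketch_cancel_all(struct sketch_td *tds, size_t n)
{
	int commands = 0;

	while (sketch_invalidate(tds, n) >= 0) {
		commands++;
		if (!sketch_handle_set_deq(tds, n))
			break;
	}
	return commands;
}
```

In this model, three cancelled TDs on three different streams take three
command round trips, one per stream, and every TD ends up cleared rather
than being marked complete while still cached.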
This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct
assumptions about number of rings per endpoint.") early on in the XHCI
driver's life, when stream support was first added. At that point, this
condition would cause TDs to not be given back at all, causing hanging
transfers - but no security bug. It was then identified but not fixed
nor made into a warning in commit 674f8438c121 ("xhci: split handling
halted endpoints into two steps"), which added a FIXME comment for the
problem case (without materially changing the behavior as far as I can
tell, though the new logic made the problem more obvious).
Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some
cached cancelled URBs."), it was acknowledged again. This commit was
unfortunately not reviewed at all, as it was authored by the maintainer
directly. Had it been, perhaps a second set of eyes would've noticed
that it does not fix the bug, but rather just makes it (much) worse.
It turns the "transfers hang" bug into a "random memory corruption" bug,
by blindly marking TDs as complete without actually clearing them at all
nor moving the dequeue pointer past them, which means they aren't actually
complete, and the xHC will try to transfer data to/from them when the
endpoint resumes, now to freed memory buffers.
This could have been a legitimate oversight, but apparently the commit
author was aware of the problem (yet still chose to submit it): It was
still mentioned as a FIXME, an xhci_dbg() was added to log the problem
condition, and the remaining issue was mentioned in the commit
description. The choice of making the log type xhci_dbg() for what is,
at this point, a completely unhandled and known broken condition is
puzzling and unfortunate, as it guarantees that no actual users would
see the log in production, thereby making it nigh undebuggable (indeed,
even if you turn on DEBUG, the message doesn't really hint at there
being a problem at all).
It took me *months* of random xHC crashes to finally find a reliable
repro and be able to do a deep dive debug session, which could all have
been avoided had this unhandled, broken condition been actually reported
with a warning, as it should have been as a bug intentionally left in
unfixed (never mind that it shouldn't have been left in at all).
> Another fix to solve clearing the caches of all stream rings with
> cancelled TDs is needed, but not as urgent.
3 years after that statement and 14 years after the original bug was
introduced, I think it's finally time to fix it. And maybe next time
let's not leave bugs unfixed (that are actually worse than the original
bug), and let's actually get people to review kernel commits please.
Fixes xHC crashes and IOMMU faults with UAS devices when handling
errors/faults. Easiest repro is to use `hdparm` to mark an early sector
(e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop.
At least in the case of JMicron controllers, the read errors end up
having to cancel two TDs (for two queued requests to different streams)
and the one that didn't get cleared properly ends up faulting the xHC
entirely when it tries to access DMA pages that have since been unmapped,
referred to by the stale TDs. This normally happens quickly (after two
or three loops). After this fix, I left the `cat` in a loop running
overnight and experienced no xHC failures, with all read errors
recovered properly. Repro'd and tested on an Apple M1 Mac Mini
(dwc3 host).
On systems without an IOMMU, this bug would instead silently corrupt
freed memory, making this a security bug (even on systems with IOMMUs
this could silently corrupt memory belonging to other USB devices on the
same controller, so it's still a security bug). Given that the kernel
autoprobes partition tables, I'm pretty sure a malicious USB device
pretending to be a UAS device and reporting an error with the right
timing could deliberately trigger a UAF and write to freed memory, with
no user action.
Fixes: e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.")
Fixes: 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.")
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: stable(a)vger.kernel.org
Cc: security(a)kernel.org
Signed-off-by: Hector Martin <marcan(a)marcan.st>
---
drivers/usb/host/xhci-ring.c | 54 +++++++++++++++++++++++++++++++++++---------
drivers/usb/host/xhci.h | 1 +
2 files changed, 44 insertions(+), 11 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 575f0fd9c9f1..9c06502be098 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1034,13 +1034,27 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
break;
case TD_DIRTY: /* TD is cached, clear it */
case TD_HALTED:
+ case TD_CLEARING_CACHE_DEFERRED:
+ if (cached_td) {
+ if (cached_td->urb->stream_id != td->urb->stream_id) {
+ /* Multiple streams case, defer move dq */
+ xhci_dbg(xhci,
+ "Move dq deferred: stream %u URB %p\n",
+ td->urb->stream_id, td->urb);
+ td->cancel_status = TD_CLEARING_CACHE_DEFERRED;
+ break;
+ }
+
+ /* Should never happen, at least try to clear the TD if it does */
+ xhci_warn(xhci,
+ "Found multiple active URBs %p and %p in stream %u?\n",
+ td->urb, cached_td->urb,
+ td->urb->stream_id);
+ td_to_noop(xhci, ring, cached_td, false);
+ cached_td->cancel_status = TD_CLEARED;
+ }
+
td->cancel_status = TD_CLEARING_CACHE;
- if (cached_td)
- /* FIXME stream case, several stopped rings */
- xhci_dbg(xhci,
- "Move dq past stream %u URB %p instead of stream %u URB %p\n",
- td->urb->stream_id, td->urb,
- cached_td->urb->stream_id, cached_td->urb);
cached_td = td;
break;
}
@@ -1060,10 +1074,16 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
if (err) {
/* Failed to move past cached td, just set cached TDs to no-op */
list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list, cancelled_td_list) {
- if (td->cancel_status != TD_CLEARING_CACHE)
+ /*
+ * Deferred TDs need to have the deq pointer set after the above command
+ * completes, so if that failed we just give up on all of them (and
+ * complain loudly since this could cause issues due to caching).
+ */
+ if (td->cancel_status != TD_CLEARING_CACHE &&
+ td->cancel_status != TD_CLEARING_CACHE_DEFERRED)
continue;
- xhci_dbg(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n",
- td->urb);
+ xhci_warn(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n",
+ td->urb);
td_to_noop(xhci, ring, td, false);
td->cancel_status = TD_CLEARED;
}
@@ -1350,6 +1370,7 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id,
struct xhci_ep_ctx *ep_ctx;
struct xhci_slot_ctx *slot_ctx;
struct xhci_td *td, *tmp_td;
+ bool deferred = false;
ep_index = TRB_TO_EP_INDEX(le32_to_cpu(trb->generic.field[3]));
stream_id = TRB_TO_STREAM_ID(le32_to_cpu(trb->generic.field[2]));
@@ -1436,6 +1457,8 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id,
xhci_dbg(ep->xhci, "%s: Giveback cancelled URB %p TD\n",
__func__, td->urb);
xhci_td_cleanup(ep->xhci, td, ep_ring, td->status);
+ } else if (td->cancel_status == TD_CLEARING_CACHE_DEFERRED) {
+ deferred = true;
} else {
xhci_dbg(ep->xhci, "%s: Keep cancelled URB %p TD as cancel_status is %d\n",
__func__, td->urb, td->cancel_status);
@@ -1445,8 +1468,17 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id,
ep->ep_state &= ~SET_DEQ_PENDING;
ep->queued_deq_seg = NULL;
ep->queued_deq_ptr = NULL;
- /* Restart any rings with pending URBs */
- ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
+
+ if (deferred) {
+ /* We have more streams to clear */
+ xhci_dbg(ep->xhci, "%s: Pending TDs to clear, continuing with invalidation\n",
+ __func__);
+ xhci_invalidate_cancelled_tds(ep);
+ } else {
+ /* Restart any rings with pending URBs */
+ xhci_dbg(ep->xhci, "%s: All TDs cleared, ring doorbell\n", __func__);
+ ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
+ }
}
static void xhci_handle_cmd_reset_ep(struct xhci_hcd *xhci, int slot_id,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 6f4bf98a6282..aa4379bdb90c 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1276,6 +1276,7 @@ enum xhci_cancelled_td_status {
TD_DIRTY = 0,
TD_HALTED,
TD_CLEARING_CACHE,
+ TD_CLEARING_CACHE_DEFERRED,
TD_CLEARED,
};
---
base-commit: a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
change-id: 20240524-xhci-streams-124e88db52e6
Best regards,
--
Hector Martin <marcan(a)marcan.st>
Am 26.05.24 um 22:19 schrieb Sasha Levin:
> This is a note to let you know that I've just added the patch titled
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> to the 5.4-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> platform-x86-xiaomi-wmi-fix-race-condition-when-repo.patch
> and it can be found in the queue-5.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
Hi,
the underlying race condition can only be triggered since
commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on all CPUs"), which
afaik was introduced with kernel 6.8.
Because of this, I do not think that we have to backport this commit to kernels before 6.8.
Thanks,
Armin Wolf
>
> commit 7217162b48f60edc29afbeff641b7de02076bb86
> Author: Armin Wolf <W_Armin(a)gmx.de>
> Date: Tue Apr 2 16:30:57 2024 +0200
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> [ Upstream commit 290680c2da8061e410bcaec4b21584ed951479af ]
>
> Multiple WMI events can be received concurrently, so multiple instances
> of xiaomi_wmi_notify() can be active at the same time. Since the input
> device is shared between those handlers, the key input sequence can be
> disturbed.
>
> Fix this by protecting the key input sequence with a mutex.
>
> Compile-tested only.
>
> Fixes: edb73f4f0247 ("platform/x86: wmi: add Xiaomi WMI key driver")
> Signed-off-by: Armin Wolf <W_Armin(a)gmx.de>
> Link: https://lore.kernel.org/r/20240402143059.8456-2-W_Armin@gmx.de
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
> Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/platform/x86/xiaomi-wmi.c b/drivers/platform/x86/xiaomi-wmi.c
> index 54a2546bb93bf..be80f0bda9484 100644
> --- a/drivers/platform/x86/xiaomi-wmi.c
> +++ b/drivers/platform/x86/xiaomi-wmi.c
> @@ -2,8 +2,10 @@
> /* WMI driver for Xiaomi Laptops */
>
> #include <linux/acpi.h>
> +#include <linux/device.h>
> #include <linux/input.h>
> #include <linux/module.h>
> +#include <linux/mutex.h>
> #include <linux/wmi.h>
>
> #include <uapi/linux/input-event-codes.h>
> @@ -20,12 +22,21 @@
>
> struct xiaomi_wmi {
> struct input_dev *input_dev;
> + struct mutex key_lock; /* Protects the key event sequence */
> unsigned int key_code;
> };
>
> +static void xiaomi_mutex_destroy(void *data)
> +{
> + struct mutex *lock = data;
> +
> + mutex_destroy(lock);
> +}
> +
> static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> {
> struct xiaomi_wmi *data;
> + int ret;
>
> if (wdev == NULL || context == NULL)
> return -EINVAL;
> @@ -35,6 +46,11 @@ static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> return -ENOMEM;
> dev_set_drvdata(&wdev->dev, data);
>
> + mutex_init(&data->key_lock);
> + ret = devm_add_action_or_reset(&wdev->dev, xiaomi_mutex_destroy, &data->key_lock);
> + if (ret < 0)
> + return ret;
> +
> data->input_dev = devm_input_allocate_device(&wdev->dev);
> if (data->input_dev == NULL)
> return -ENOMEM;
> @@ -59,10 +75,12 @@ static void xiaomi_wmi_notify(struct wmi_device *wdev, union acpi_object *dummy)
> if (data == NULL)
> return;
>
> + mutex_lock(&data->key_lock);
> input_report_key(data->input_dev, data->key_code, 1);
> input_sync(data->input_dev);
> input_report_key(data->input_dev, data->key_code, 0);
> input_sync(data->input_dev);
> + mutex_unlock(&data->key_lock);
> }
>
> static const struct wmi_device_id xiaomi_wmi_id_table[] = {
Am 26.05.24 um 22:14 schrieb Sasha Levin:
> This is a note to let you know that I've just added the patch titled
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> to the 5.10-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> platform-x86-xiaomi-wmi-fix-race-condition-when-repo.patch
> and it can be found in the queue-5.10 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
Hi,
the underlying race condition can only be triggered since
commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on all CPUs"), which
afaik was introduced with kernel 6.8.
Because of this, I do not think that we have to backport this commit to kernels before 6.8.
Thanks,
Armin Wolf
>
> commit 6f4e7901c3ed3c0bd3da7af5854dbb765fad2e00
> Author: Armin Wolf <W_Armin(a)gmx.de>
> Date: Tue Apr 2 16:30:57 2024 +0200
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> [ Upstream commit 290680c2da8061e410bcaec4b21584ed951479af ]
>
> Multiple WMI events can be received concurrently, so multiple instances
> of xiaomi_wmi_notify() can be active at the same time. Since the input
> device is shared between those handlers, the key input sequence can be
> disturbed.
>
> Fix this by protecting the key input sequence with a mutex.
>
> Compile-tested only.
>
> Fixes: edb73f4f0247 ("platform/x86: wmi: add Xiaomi WMI key driver")
> Signed-off-by: Armin Wolf <W_Armin(a)gmx.de>
> Link: https://lore.kernel.org/r/20240402143059.8456-2-W_Armin@gmx.de
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
> Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/platform/x86/xiaomi-wmi.c b/drivers/platform/x86/xiaomi-wmi.c
> index 54a2546bb93bf..be80f0bda9484 100644
> --- a/drivers/platform/x86/xiaomi-wmi.c
> +++ b/drivers/platform/x86/xiaomi-wmi.c
> @@ -2,8 +2,10 @@
> /* WMI driver for Xiaomi Laptops */
>
> #include <linux/acpi.h>
> +#include <linux/device.h>
> #include <linux/input.h>
> #include <linux/module.h>
> +#include <linux/mutex.h>
> #include <linux/wmi.h>
>
> #include <uapi/linux/input-event-codes.h>
> @@ -20,12 +22,21 @@
>
> struct xiaomi_wmi {
> struct input_dev *input_dev;
> + struct mutex key_lock; /* Protects the key event sequence */
> unsigned int key_code;
> };
>
> +static void xiaomi_mutex_destroy(void *data)
> +{
> + struct mutex *lock = data;
> +
> + mutex_destroy(lock);
> +}
> +
> static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> {
> struct xiaomi_wmi *data;
> + int ret;
>
> if (wdev == NULL || context == NULL)
> return -EINVAL;
> @@ -35,6 +46,11 @@ static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> return -ENOMEM;
> dev_set_drvdata(&wdev->dev, data);
>
> + mutex_init(&data->key_lock);
> + ret = devm_add_action_or_reset(&wdev->dev, xiaomi_mutex_destroy, &data->key_lock);
> + if (ret < 0)
> + return ret;
> +
> data->input_dev = devm_input_allocate_device(&wdev->dev);
> if (data->input_dev == NULL)
> return -ENOMEM;
> @@ -59,10 +75,12 @@ static void xiaomi_wmi_notify(struct wmi_device *wdev, union acpi_object *dummy)
> if (data == NULL)
> return;
>
> + mutex_lock(&data->key_lock);
> input_report_key(data->input_dev, data->key_code, 1);
> input_sync(data->input_dev);
> input_report_key(data->input_dev, data->key_code, 0);
> input_sync(data->input_dev);
> + mutex_unlock(&data->key_lock);
> }
>
> static const struct wmi_device_id xiaomi_wmi_id_table[] = {
Am 26.05.24 um 22:07 schrieb Sasha Levin:
> This is a note to let you know that I've just added the patch titled
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> platform-x86-xiaomi-wmi-fix-race-condition-when-repo.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
Hi,
the underlying race condition can only be triggered since
commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on all CPUs"), which
as far as I know was introduced with kernel 6.8.
Because of this, I do not think that we have to backport this commit to kernels before 6.8.
Thanks,
Armin Wolf
>
> commit 1f436551dd453c28c23f800e7273136e526197cb
> Author: Armin Wolf <W_Armin(a)gmx.de>
> Date: Tue Apr 2 16:30:57 2024 +0200
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> [ Upstream commit 290680c2da8061e410bcaec4b21584ed951479af ]
>
> Multiple WMI events can be received concurrently, so multiple instances
> of xiaomi_wmi_notify() can be active at the same time. Since the input
> device is shared between those handlers, the key input sequence can be
> disturbed.
>
> Fix this by protecting the key input sequence with a mutex.
>
> Compile-tested only.
>
> Fixes: edb73f4f0247 ("platform/x86: wmi: add Xiaomi WMI key driver")
> Signed-off-by: Armin Wolf <W_Armin(a)gmx.de>
> Link: https://lore.kernel.org/r/20240402143059.8456-2-W_Armin@gmx.de
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
> Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/platform/x86/xiaomi-wmi.c b/drivers/platform/x86/xiaomi-wmi.c
> index 54a2546bb93bf..be80f0bda9484 100644
> --- a/drivers/platform/x86/xiaomi-wmi.c
> +++ b/drivers/platform/x86/xiaomi-wmi.c
> @@ -2,8 +2,10 @@
> /* WMI driver for Xiaomi Laptops */
>
> #include <linux/acpi.h>
> +#include <linux/device.h>
> #include <linux/input.h>
> #include <linux/module.h>
> +#include <linux/mutex.h>
> #include <linux/wmi.h>
>
> #include <uapi/linux/input-event-codes.h>
> @@ -20,12 +22,21 @@
>
> struct xiaomi_wmi {
> struct input_dev *input_dev;
> + struct mutex key_lock; /* Protects the key event sequence */
> unsigned int key_code;
> };
>
> +static void xiaomi_mutex_destroy(void *data)
> +{
> + struct mutex *lock = data;
> +
> + mutex_destroy(lock);
> +}
> +
> static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> {
> struct xiaomi_wmi *data;
> + int ret;
>
> if (wdev == NULL || context == NULL)
> return -EINVAL;
> @@ -35,6 +46,11 @@ static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> return -ENOMEM;
> dev_set_drvdata(&wdev->dev, data);
>
> + mutex_init(&data->key_lock);
> + ret = devm_add_action_or_reset(&wdev->dev, xiaomi_mutex_destroy, &data->key_lock);
> + if (ret < 0)
> + return ret;
> +
> data->input_dev = devm_input_allocate_device(&wdev->dev);
> if (data->input_dev == NULL)
> return -ENOMEM;
> @@ -59,10 +75,12 @@ static void xiaomi_wmi_notify(struct wmi_device *wdev, union acpi_object *dummy)
> if (data == NULL)
> return;
>
> + mutex_lock(&data->key_lock);
> input_report_key(data->input_dev, data->key_code, 1);
> input_sync(data->input_dev);
> input_report_key(data->input_dev, data->key_code, 0);
> input_sync(data->input_dev);
> + mutex_unlock(&data->key_lock);
> }
>
> static const struct wmi_device_id xiaomi_wmi_id_table[] = {
On 26.05.24 at 21:57, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> platform-x86-xiaomi-wmi-fix-race-condition-when-repo.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
Hi,
the underlying race condition can only be triggered since
commit e2ffcda16290 ("ACPI: OSL: Allow Notify () handlers to run on all CPUs"), which
as far as I know was introduced with kernel 6.8.
Because of this, I do not think that we have to backport this commit to kernels before 6.8.
Thanks,
Armin Wolf
>
> commit 1abdef69265133db29772ed5cefea2338f8ce173
> Author: Armin Wolf <W_Armin(a)gmx.de>
> Date: Tue Apr 2 16:30:57 2024 +0200
>
> platform/x86: xiaomi-wmi: Fix race condition when reporting key events
>
> [ Upstream commit 290680c2da8061e410bcaec4b21584ed951479af ]
>
> Multiple WMI events can be received concurrently, so multiple instances
> of xiaomi_wmi_notify() can be active at the same time. Since the input
> device is shared between those handlers, the key input sequence can be
> disturbed.
>
> Fix this by protecting the key input sequence with a mutex.
>
> Compile-tested only.
>
> Fixes: edb73f4f0247 ("platform/x86: wmi: add Xiaomi WMI key driver")
> Signed-off-by: Armin Wolf <W_Armin(a)gmx.de>
> Link: https://lore.kernel.org/r/20240402143059.8456-2-W_Armin@gmx.de
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy(a)linux.intel.com>
> Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/platform/x86/xiaomi-wmi.c b/drivers/platform/x86/xiaomi-wmi.c
> index 54a2546bb93bf..be80f0bda9484 100644
> --- a/drivers/platform/x86/xiaomi-wmi.c
> +++ b/drivers/platform/x86/xiaomi-wmi.c
> @@ -2,8 +2,10 @@
> /* WMI driver for Xiaomi Laptops */
>
> #include <linux/acpi.h>
> +#include <linux/device.h>
> #include <linux/input.h>
> #include <linux/module.h>
> +#include <linux/mutex.h>
> #include <linux/wmi.h>
>
> #include <uapi/linux/input-event-codes.h>
> @@ -20,12 +22,21 @@
>
> struct xiaomi_wmi {
> struct input_dev *input_dev;
> + struct mutex key_lock; /* Protects the key event sequence */
> unsigned int key_code;
> };
>
> +static void xiaomi_mutex_destroy(void *data)
> +{
> + struct mutex *lock = data;
> +
> + mutex_destroy(lock);
> +}
> +
> static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> {
> struct xiaomi_wmi *data;
> + int ret;
>
> if (wdev == NULL || context == NULL)
> return -EINVAL;
> @@ -35,6 +46,11 @@ static int xiaomi_wmi_probe(struct wmi_device *wdev, const void *context)
> return -ENOMEM;
> dev_set_drvdata(&wdev->dev, data);
>
> + mutex_init(&data->key_lock);
> + ret = devm_add_action_or_reset(&wdev->dev, xiaomi_mutex_destroy, &data->key_lock);
> + if (ret < 0)
> + return ret;
> +
> data->input_dev = devm_input_allocate_device(&wdev->dev);
> if (data->input_dev == NULL)
> return -ENOMEM;
> @@ -59,10 +75,12 @@ static void xiaomi_wmi_notify(struct wmi_device *wdev, union acpi_object *dummy)
> if (data == NULL)
> return;
>
> + mutex_lock(&data->key_lock);
> input_report_key(data->input_dev, data->key_code, 1);
> input_sync(data->input_dev);
> input_report_key(data->input_dev, data->key_code, 0);
> input_sync(data->input_dev);
> + mutex_unlock(&data->key_lock);
> }
>
> static const struct wmi_device_id xiaomi_wmi_id_table[] = {
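For readers who want to see why the lock in the quoted xiaomi-wmi patch matters, here is a minimal user-space sketch, assuming POSIX threads stand in for concurrent WMI notify handlers. The names key_dev, report_key(), notify() and run_two_handlers() are hypothetical and are not part of the driver; report_key() only models the press/release state that the input core would see.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of the driver state; not the real struct xiaomi_wmi. */
struct key_dev {
	pthread_mutex_t key_lock; /* protects the press/release sequence */
	int pressed;              /* key state as the "input core" sees it */
	int glitches;             /* reports that did not change the state */
};

/* Models input_report_key(): a repeated press or release is a glitch. */
static void report_key(struct key_dev *d, int value)
{
	if (d->pressed == value)
		d->glitches++;
	d->pressed = value;
}

/* Models the notify handler: press then release, as one locked sequence. */
static void notify(struct key_dev *d)
{
	pthread_mutex_lock(&d->key_lock);
	report_key(d, 1);
	report_key(d, 0);
	pthread_mutex_unlock(&d->key_lock);
}

static void *worker(void *arg)
{
	for (int i = 0; i < 100000; i++)
		notify(arg);
	return NULL;
}

/* Runs two concurrent "handlers" and returns the number of glitches seen. */
static int run_two_handlers(void)
{
	struct key_dev d = { .pressed = 0, .glitches = 0 };
	pthread_t a, b;

	pthread_mutex_init(&d.key_lock, NULL);
	pthread_create(&a, NULL, worker, &d);
	pthread_create(&b, NULL, worker, &d);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_mutex_destroy(&d.key_lock);
	return d.glitches;
}
```

With the lock held across both reports, every sequence starts from the released state, so run_two_handlers() should return 0. Dropping the lock/unlock pair in notify() would allow interleavings such as press, press, release, release.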
Hi,
I noticed that since https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?… (which also got backported to -stable kernels) several dmesg messages regarding the ATA subsystem are no longer logged.
For example, on my Dell PowerEdge 840 which has only one PATA port I used to get:
scsi host1: ata_piix
ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xfc00 irq 14
ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xfc08 irq 15
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ata2: port disabled--ignoring
ata1.01: NODEV after polling detection
ata1.00: ATAPI: SAMSUNG CD-R/RW SW-248F, R602, max UDMA/33
After that commit, the following two log entries are missing:
ata2: port disabled--ignoring
ata1.01: NODEV after polling detection
Note that these are just examples; many more messages are affected.
Looking at the code, these messages are logged via ata_link_dbg / ata_dev_dbg:
ata_link_dbg(link, "port disabled--ignoring\n"); [in drivers/ata/libata-eh.c]
ata_dev_dbg(dev, "NODEV after polling detection\n"); [in drivers/ata/libata-core.c]
The commit changed how logging is done: the ata_dev_printk() function, which called printk() directly, was replaced with the following macro:
+#define ata_dev_printk(level, dev, fmt, ...) \
+ pr_ ## level("ata%u.%02u: " fmt, \
+ (dev)->link->ap->print_id, \
+ (dev)->link->pmp + (dev)->devno, \
+ ##__VA_ARGS__)
(...)
#define ata_link_dbg(link, fmt, ...) \
- ata_link_printk(link, KERN_DEBUG, fmt, ##__VA_ARGS__)
+ ata_link_printk(debug, link, fmt, ##__VA_ARGS__)
(...)
#define ata_dev_dbg(dev, fmt, ...) \
- ata_dev_printk(dev, KERN_DEBUG, fmt, ##__VA_ARGS__)
+ ata_dev_printk(debug, dev, fmt, ##__VA_ARGS__)
So, instead of calling printk() with the KERN_DEBUG level, we now call pr_debug(). This is a problem because printk(KERN_DEBUG ...) and pr_debug(...) are not equivalent.
pr_debug is defined as:
/* If you are writing a driver, please use dev_dbg instead */
#if defined(CONFIG_DYNAMIC_DEBUG) || \
(defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE))
#include <linux/dynamic_debug.h>
/**
* pr_debug - Print a debug-level message conditionally
* @fmt: format string
* @...: arguments for the format string
*
* This macro expands to dynamic_pr_debug() if CONFIG_DYNAMIC_DEBUG is
* set. Otherwise, if DEBUG is defined, it's equivalent to a printk with
* KERN_DEBUG loglevel. If DEBUG is not defined it does nothing.
*
* It uses pr_fmt() to generate the format string (dynamic_pr_debug() uses
* pr_fmt() internally).
*/
#define pr_debug(fmt, ...) \
dynamic_pr_debug(fmt, ##__VA_ARGS__)
#elif defined(DEBUG)
#define pr_debug(fmt, ...) \
printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
#else
#define pr_debug(fmt, ...) \
no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
#endif
Without CONFIG_DYNAMIC_DEBUG, and if DEBUG is not defined, the end result is a call to no_printk(), which means nothing gets logged.
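The difference can be reproduced in user space with a minimal sketch, assuming printf() stands in for printk(). The names my_printk, my_no_printk, my_pr_debug, emit_both() and the logged counter are hypothetical; only the #ifdef DEBUG / #else shape follows the pr_debug() definition quoted above (the dynamic debug branch is left out).

```c
#include <stdio.h>

/* Counts messages that were actually emitted; a stand-in for dmesg. */
static int logged;

/* Hypothetical stand-in for printk(): always emits. */
#define my_printk(...) \
	(logged++, printf(__VA_ARGS__))

/* Stand-in for no_printk(): arguments are still type-checked, never emitted. */
#define my_no_printk(...) \
	((void)(0 && printf(__VA_ARGS__)))

/* Same shape as the non-dynamic-debug branches of pr_debug(). */
#ifdef DEBUG
#define my_pr_debug(...) my_printk(__VA_ARGS__)
#else
#define my_pr_debug(...) my_no_printk(__VA_ARGS__)
#endif

/* Emits one message each way and returns how many reached the log. */
static int emit_both(void)
{
	logged = 0;
	my_printk("ata%u: port disabled--ignoring\n", 2u);   /* old: always logged */
	my_pr_debug("ata%u: port disabled--ignoring\n", 2u); /* new: needs DEBUG */
	return logged;
}
```

Built without -DDEBUG, emit_both() reports a single emitted message; built with -DDEBUG, both are emitted. That matches the behaviour described above for kernels without CONFIG_DYNAMIC_DEBUG.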
Looking at the code, more calls are affected, for example ata_dev_dbg(dev, "disabling queued TRIM support\n") for ATA_HORKAGE_NO_NCQ_TRIM, which also seems like important information to log.
Was this change done intentionally? Note that most of the "debug" printks in libata code seem to be guarded by ata_msg_info / ata_msg_probe / ATA_DEBUG which was sufficient to prevent excess debug information logging.
One of the cases like this was covered in the patch:
-#ifdef ATA_DEBUG
if (status != 0xff && (status & (ATA_BUSY | ATA_DRQ)))
- ata_port_printk(ap, KERN_DEBUG, "abnormal Status 0x%X\n",
- status);
-#endif
+ ata_port_dbg(ap, "abnormal Status 0x%X\n", status);
Assuming this is the intended direction, would it make sense for now to at least promote "unconditionally" logged messages from ata_link_dbg/ata_dev_dbg to ata_link_info/ata_dev_info?
Longer term, perhaps we want to revisit ata_msg_info/ata_msg_probe/ATA_DEBUG/ATA_VERBOSE_DEBUG vs ata_dev_printk/ata_link_printk/pr_debug (and maybe also pr_devel), especially now that DYNAMIC_DEBUG is available these days...
Best regards,
Krzysztof Olędzki
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x f5d4e04634c9cf68bdf23de08ada0bb92e8befe7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052609-speckled-elusive-2d6c@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f5d4e04634c9cf68bdf23de08ada0bb92e8befe7 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Mon, 20 May 2024 22:26:19 +0900
Subject: [PATCH] nilfs2: fix use-after-free of timer for log writer thread
Patch series "nilfs2: fix log writer related issues".
This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis. Details are described in each commit log.
This patch (of 3):
A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.
The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info structure
is about to be freed, and is used regardless of the thread's lifetime.
Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.
Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 6be7dd423fbd..7cb34e1c9206 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(struct nilfs_sc_info *sci)
{
spin_lock(&sci->sc_state_lock);
if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) {
- sci->sc_timer.expires = jiffies + sci->sc_interval;
- add_timer(&sci->sc_timer);
+ if (sci->sc_task) {
+ sci->sc_timer.expires = jiffies + sci->sc_interval;
+ add_timer(&sci->sc_timer);
+ }
sci->sc_state |= NILFS_SEGCTOR_COMMIT;
}
spin_unlock(&sci->sc_state_lock);
@@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct super_block *sb, struct inode *inode,
*/
static void nilfs_segctor_accept(struct nilfs_sc_info *sci)
{
+ bool thread_is_alive;
+
spin_lock(&sci->sc_state_lock);
sci->sc_seq_accepted = sci->sc_seq_request;
+ thread_is_alive = (bool)sci->sc_task;
spin_unlock(&sci->sc_state_lock);
- del_timer_sync(&sci->sc_timer);
+
+ /*
+ * This function does not race with the log writer thread's
+ * termination. Therefore, deleting sc_timer, which should not be
+ * done after the log writer thread exits, can be done safely outside
+ * the area protected by sc_state_lock.
+ */
+ if (thread_is_alive)
+ del_timer_sync(&sci->sc_timer);
}
/**
@@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct nilfs_sc_info *sci, int mode, int err)
sci->sc_flush_request &= ~FLUSH_DAT_BIT;
/* re-enable timer if checkpoint creation was not done */
- if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+ if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task &&
time_before(jiffies, sci->sc_timer.expires))
add_timer(&sci->sc_timer);
}
@@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *arg)
int timeout = 0;
sci->sc_timer_task = current;
+ timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
/* start sync. */
sci->sc_task = current;
@@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *arg)
end_thread:
/* end sync. */
sci->sc_task = NULL;
+ timer_shutdown_sync(&sci->sc_timer);
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
spin_unlock(&sci->sc_state_lock);
return 0;
@@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segctor_new(struct super_block *sb,
INIT_LIST_HEAD(&sci->sc_gc_inodes);
INIT_LIST_HEAD(&sci->sc_iput_queue);
INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func);
- timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT;
sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ;
@@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
down_write(&nilfs->ns_segctor_sem);
- timer_shutdown_sync(&sci->sc_timer);
kfree(sci);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f5d4e04634c9cf68bdf23de08ada0bb92e8befe7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052606-skater-friction-3021@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f5d4e04634c9cf68bdf23de08ada0bb92e8befe7 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Mon, 20 May 2024 22:26:19 +0900
Subject: [PATCH] nilfs2: fix use-after-free of timer for log writer thread
Patch series "nilfs2: fix log writer related issues".
This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis. Details are described in each commit log.
This patch (of 3):
A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.
The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info structure
is about to be freed, and is used regardless of the thread's lifetime.
Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.
Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 6be7dd423fbd..7cb34e1c9206 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(struct nilfs_sc_info *sci)
{
spin_lock(&sci->sc_state_lock);
if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) {
- sci->sc_timer.expires = jiffies + sci->sc_interval;
- add_timer(&sci->sc_timer);
+ if (sci->sc_task) {
+ sci->sc_timer.expires = jiffies + sci->sc_interval;
+ add_timer(&sci->sc_timer);
+ }
sci->sc_state |= NILFS_SEGCTOR_COMMIT;
}
spin_unlock(&sci->sc_state_lock);
@@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct super_block *sb, struct inode *inode,
*/
static void nilfs_segctor_accept(struct nilfs_sc_info *sci)
{
+ bool thread_is_alive;
+
spin_lock(&sci->sc_state_lock);
sci->sc_seq_accepted = sci->sc_seq_request;
+ thread_is_alive = (bool)sci->sc_task;
spin_unlock(&sci->sc_state_lock);
- del_timer_sync(&sci->sc_timer);
+
+ /*
+ * This function does not race with the log writer thread's
+ * termination. Therefore, deleting sc_timer, which should not be
+ * done after the log writer thread exits, can be done safely outside
+ * the area protected by sc_state_lock.
+ */
+ if (thread_is_alive)
+ del_timer_sync(&sci->sc_timer);
}
/**
@@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct nilfs_sc_info *sci, int mode, int err)
sci->sc_flush_request &= ~FLUSH_DAT_BIT;
/* re-enable timer if checkpoint creation was not done */
- if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+ if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task &&
time_before(jiffies, sci->sc_timer.expires))
add_timer(&sci->sc_timer);
}
@@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *arg)
int timeout = 0;
sci->sc_timer_task = current;
+ timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
/* start sync. */
sci->sc_task = current;
@@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *arg)
end_thread:
/* end sync. */
sci->sc_task = NULL;
+ timer_shutdown_sync(&sci->sc_timer);
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
spin_unlock(&sci->sc_state_lock);
return 0;
@@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segctor_new(struct super_block *sb,
INIT_LIST_HEAD(&sci->sc_gc_inodes);
INIT_LIST_HEAD(&sci->sc_iput_queue);
INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func);
- timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT;
sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ;
@@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
down_write(&nilfs->ns_segctor_sem);
- timer_shutdown_sync(&sci->sc_timer);
kfree(sci);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f5d4e04634c9cf68bdf23de08ada0bb92e8befe7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052605-pretext-jugular-5085@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f5d4e04634c9cf68bdf23de08ada0bb92e8befe7 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Mon, 20 May 2024 22:26:19 +0900
Subject: [PATCH] nilfs2: fix use-after-free of timer for log writer thread
Patch series "nilfs2: fix log writer related issues".
This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis. Details are described in each commit log.
This patch (of 3):
A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.
The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info structure
is about to be freed, and is used regardless of the thread's lifetime.
Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.
Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 6be7dd423fbd..7cb34e1c9206 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(struct nilfs_sc_info *sci)
{
spin_lock(&sci->sc_state_lock);
if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) {
- sci->sc_timer.expires = jiffies + sci->sc_interval;
- add_timer(&sci->sc_timer);
+ if (sci->sc_task) {
+ sci->sc_timer.expires = jiffies + sci->sc_interval;
+ add_timer(&sci->sc_timer);
+ }
sci->sc_state |= NILFS_SEGCTOR_COMMIT;
}
spin_unlock(&sci->sc_state_lock);
@@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct super_block *sb, struct inode *inode,
*/
static void nilfs_segctor_accept(struct nilfs_sc_info *sci)
{
+ bool thread_is_alive;
+
spin_lock(&sci->sc_state_lock);
sci->sc_seq_accepted = sci->sc_seq_request;
+ thread_is_alive = (bool)sci->sc_task;
spin_unlock(&sci->sc_state_lock);
- del_timer_sync(&sci->sc_timer);
+
+ /*
+ * This function does not race with the log writer thread's
+ * termination. Therefore, deleting sc_timer, which should not be
+ * done after the log writer thread exits, can be done safely outside
+ * the area protected by sc_state_lock.
+ */
+ if (thread_is_alive)
+ del_timer_sync(&sci->sc_timer);
}
/**
@@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct nilfs_sc_info *sci, int mode, int err)
sci->sc_flush_request &= ~FLUSH_DAT_BIT;
/* re-enable timer if checkpoint creation was not done */
- if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+ if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task &&
time_before(jiffies, sci->sc_timer.expires))
add_timer(&sci->sc_timer);
}
@@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *arg)
int timeout = 0;
sci->sc_timer_task = current;
+ timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
/* start sync. */
sci->sc_task = current;
@@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *arg)
end_thread:
/* end sync. */
sci->sc_task = NULL;
+ timer_shutdown_sync(&sci->sc_timer);
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
spin_unlock(&sci->sc_state_lock);
return 0;
@@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segctor_new(struct super_block *sb,
INIT_LIST_HEAD(&sci->sc_gc_inodes);
INIT_LIST_HEAD(&sci->sc_iput_queue);
INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func);
- timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT;
sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ;
@@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
down_write(&nilfs->ns_segctor_sem);
- timer_shutdown_sync(&sci->sc_timer);
kfree(sci);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x f5d4e04634c9cf68bdf23de08ada0bb92e8befe7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052603-probing-percolate-6265@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f5d4e04634c9cf68bdf23de08ada0bb92e8befe7 Mon Sep 17 00:00:00 2001
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Date: Mon, 20 May 2024 22:26:19 +0900
Subject: [PATCH] nilfs2: fix use-after-free of timer for log writer thread
Patch series "nilfs2: fix log writer related issues".
This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis. Details are described in each commit log.
This patch (of 3):
A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.
The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info structure
is about to be freed, and is used regardless of the thread's lifetime.
Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.
Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 6be7dd423fbd..7cb34e1c9206 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(struct nilfs_sc_info *sci)
{
spin_lock(&sci->sc_state_lock);
if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) {
- sci->sc_timer.expires = jiffies + sci->sc_interval;
- add_timer(&sci->sc_timer);
+ if (sci->sc_task) {
+ sci->sc_timer.expires = jiffies + sci->sc_interval;
+ add_timer(&sci->sc_timer);
+ }
sci->sc_state |= NILFS_SEGCTOR_COMMIT;
}
spin_unlock(&sci->sc_state_lock);
@@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct super_block *sb, struct inode *inode,
*/
static void nilfs_segctor_accept(struct nilfs_sc_info *sci)
{
+ bool thread_is_alive;
+
spin_lock(&sci->sc_state_lock);
sci->sc_seq_accepted = sci->sc_seq_request;
+ thread_is_alive = (bool)sci->sc_task;
spin_unlock(&sci->sc_state_lock);
- del_timer_sync(&sci->sc_timer);
+
+ /*
+ * This function does not race with the log writer thread's
+ * termination. Therefore, deleting sc_timer, which should not be
+ * done after the log writer thread exits, can be done safely outside
+ * the area protected by sc_state_lock.
+ */
+ if (thread_is_alive)
+ del_timer_sync(&sci->sc_timer);
}
/**
@@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct nilfs_sc_info *sci, int mode, int err)
sci->sc_flush_request &= ~FLUSH_DAT_BIT;
/* re-enable timer if checkpoint creation was not done */
- if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+ if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task &&
time_before(jiffies, sci->sc_timer.expires))
add_timer(&sci->sc_timer);
}
@@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *arg)
int timeout = 0;
sci->sc_timer_task = current;
+ timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
/* start sync. */
sci->sc_task = current;
@@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *arg)
end_thread:
/* end sync. */
sci->sc_task = NULL;
+ timer_shutdown_sync(&sci->sc_timer);
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
spin_unlock(&sci->sc_state_lock);
return 0;
@@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segctor_new(struct super_block *sb,
INIT_LIST_HEAD(&sci->sc_gc_inodes);
INIT_LIST_HEAD(&sci->sc_iput_queue);
INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func);
- timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT;
sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ;
@@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
down_write(&nilfs->ns_segctor_sem);
- timer_shutdown_sync(&sci->sc_timer);
kfree(sci);
}
commit cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for
some SMU 13.0.4/13.0.11 users") attempted to fix shutdown issues
that were reported since commit 31729e8c21ec ("drm/amd/pm: fixes a
random hang in S4 for SMU v13.0.4/11") but caused issues for some
people.
Adjust the workaround flow to properly only apply in the S4 case:
-> For shutdown go through SMU_MSG_PrepareMp1ForUnload
-> For S4 go through SMU_MSG_GfxDeviceDriverReset and
SMU_MSG_PrepareMp1ForUnload
Reported-and-tested-by: lectrode <electrodexsnet(a)gmail.com>
Closes: https://github.com/void-linux/void-packages/issues/50417
Cc: stable(a)vger.kernel.org
Fixes: cd94d1b182d2 ("dm/amd/pm: Fix problems with reboot/shutdown for some SMU 13.0.4/13.0.11 users")
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
Cc: regressions(a)lists.linux.dev
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c | 20 ++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c
index 4abfcd32747d..c7ab0d7027d9 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_4_ppt.c
@@ -226,15 +226,17 @@ static int smu_v13_0_4_system_features_control(struct smu_context *smu, bool en)
struct amdgpu_device *adev = smu->adev;
int ret = 0;
- if (!en && adev->in_s4) {
- /* Adds a GFX reset as workaround just before sending the
- * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering
- * an invalid state.
- */
- ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset,
- SMU_RESET_MODE_2, NULL);
- if (ret)
- return ret;
+ if (!en && !adev->in_s0ix) {
+ if (adev->in_s4) {
+ /* Adds a GFX reset as workaround just before sending the
+ * MP1_UNLOAD message to prevent GC/RLC/PMFW from entering
+ * an invalid state.
+ */
+ ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_GfxDeviceDriverReset,
+ SMU_RESET_MODE_2, NULL);
+ if (ret)
+ return ret;
+ }
ret = smu_cmn_send_smc_msg(smu, SMU_MSG_PrepareMp1ForUnload, NULL);
}
--
2.43.0
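For readers following the SMU 13.0.4 change above, the message ordering it describes can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions: `sketch_msg`, `sketch_state` and `sketch_unload_sequence()` are hypothetical names that do not exist in amdgpu, the sketch covers only the `!en` branch visible in the hunk, and it leaves out the early return the driver takes when the reset message fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two SMU messages named in the patch. */
enum sketch_msg {
	SKETCH_MSG_GFX_DEVICE_DRIVER_RESET,
	SKETCH_MSG_PREPARE_MP1_FOR_UNLOAD,
};

/* Hypothetical subset of the device state that the hunk consults. */
struct sketch_state {
	bool in_s0ix;
	bool in_s4;
};

/*
 * Fill 'out' with the messages that would be sent, in order, and return
 * how many were written (at most 2). Error handling is omitted.
 */
static size_t sketch_unload_sequence(const struct sketch_state *st, bool en,
				     enum sketch_msg out[2])
{
	size_t n = 0;

	/* The hunk only acts when disabling features outside of s0ix. */
	if (en || st->in_s0ix)
		return 0;

	/* The GFX reset workaround is limited to the S4 case. */
	if (st->in_s4)
		out[n++] = SKETCH_MSG_GFX_DEVICE_DRIVER_RESET;

	/* Shutdown and S4 both end with the MP1 unload message. */
	out[n++] = SKETCH_MSG_PREPARE_MP1_FOR_UNLOAD;
	return n;
}
```

With this model a plain shutdown (in_s4 and in_s0ix both false) yields the single unload message, while S4 yields the reset followed by the unload message, which matches the two arrows in the commit message.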
FWIW, this is the output from faddr2line for Linux 6.8.11:
$ scripts/faddr2line --list vmlinux blk_try_enter_queue+0xc/0x75
blk_try_enter_queue+0xc/0x75:
__ref_is_percpu at include/linux/percpu-refcount.h:174 (discriminator 2)
169 * READ_ONCE() is required when fetching it.
170 *
171 * The dependency ordering from the READ_ONCE() pairs
172 * with smp_store_release() in __percpu_ref_switch_to_percpu().
173 */
>174< percpu_ptr = READ_ONCE(ref->percpu_count_ptr);
175
176 /*
177 * Theoretically, the following could test just ATOMIC; however,
178 * then we'd have to mask off DEAD separately as DEAD may be
179 * visible without ATOMIC if we race with percpu_ref_kill(). DEAD
(inlined by) percpu_ref_tryget_live_rcu at include/linux/percpu-refcount.h:282 (discriminator 2)
277 unsigned long __percpu *percpu_count;
278 bool ret = false;
279
280 WARN_ON_ONCE(!rcu_read_lock_held());
281
>282< if (likely(__ref_is_percpu(ref, &percpu_count))) {
283 this_cpu_inc(*percpu_count);
284 ret = true;
285 } else if (!(ref->percpu_count_ptr & __PERCPU_REF_DEAD)) {
286 ret = atomic_long_inc_not_zero(&ref->data->count);
287 }
(inlined by) blk_try_enter_queue at block/blk.h:43 (discriminator 2)
38 void submit_bio_noacct_nocheck(struct bio *bio);
39
40 static inline bool blk_try_enter_queue(struct request_queue *q, bool pm)
41 {
42 rcu_read_lock();
>43< if (!percpu_ref_tryget_live_rcu(&q->q_usage_counter))
44 goto fail;
45
46 /*
47 * The code that increments the pm_only counter must ensure that the
48 * counter is globally visible before the queue is unfrozen.
From: Kemeng Shi <shikemeng(a)huaweicloud.com>
[ Upstream commit d92109891f21cf367caa2cc6dff11a4411d917f4 ]
In the case where there are no more inodes for IO in the io list from the
last wb_writeback, we may bail out early even if there is an inode in the
dirty list that should be written back. Only bail out once we have queued
at least once, to avoid missing dirtied inodes.
This is from code reading...
Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-3-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack(a)suse.cz>
[brauner(a)kernel.org: fold in memory corruption fix from Jan in [1]]
Link: https://lore.kernel.org/r/20240405132346.bid7gibby3lxxhez@quack3 [1]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/fs-writeback.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 1767493dffda7..0a498bc60f557 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2044,6 +2044,7 @@ static long wb_writeback(struct bdi_writeback *wb,
struct inode *inode;
long progress;
struct blk_plug plug;
+ bool queued = false;
blk_start_plug(&plug);
for (;;) {
@@ -2086,8 +2087,10 @@ static long wb_writeback(struct bdi_writeback *wb,
dirtied_before = jiffies;
trace_writeback_start(wb, work);
- if (list_empty(&wb->b_io))
+ if (list_empty(&wb->b_io)) {
queue_io(wb, work, dirtied_before);
+ queued = true;
+ }
if (work->sb)
progress = writeback_sb_inodes(work->sb, wb, work);
else
@@ -2102,7 +2105,7 @@ static long wb_writeback(struct bdi_writeback *wb,
* mean the overall work is done. So we keep looping as long
* as made some progress on cleaning pages or inodes.
*/
- if (progress) {
+ if (progress || !queued) {
spin_unlock(&wb->list_lock);
continue;
}
--
2.43.0
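The effect of the `queued` flag in the wb_writeback() patch above can be illustrated with a toy model in userspace C. This is a sketch of one reading of the commit message, not the kernel code: `toy_wb`, its three counters and `toy_writeback()` are hypothetical, and the model ignores b_more_io, locking, page budgets and everything else the real loop handles. It assumes that leftover io-list entries are consumed by a pass without producing progress.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified view of the writeback lists. */
struct toy_wb {
	int io_stale;	/* io-list leftovers that yield no progress */
	int io_ready;	/* io-list entries that will be written */
	int dirty;	/* dirty-list entries not yet queued */
};

/*
 * Run the loop until it bails out and return how many entries were
 * written. 'require_queued' selects the patched bail-out condition.
 */
static long toy_writeback(struct toy_wb *wb, bool require_queued)
{
	long written = 0;
	bool queued = false;

	for (;;) {
		long progress;

		if (wb->io_stale == 0 && wb->io_ready == 0) {
			/* io list is empty: queue everything dirty */
			wb->io_ready = wb->dirty;
			wb->dirty = 0;
			queued = true;
		}

		/* one pass: ready entries are written, stale ones dropped */
		progress = wb->io_ready;
		wb->io_ready = 0;
		wb->io_stale = 0;
		written += progress;

		/* keep looping on progress, or if nothing was queued yet */
		if (progress || (require_queued && !queued))
			continue;
		break;
	}
	return written;
}
```

Starting with two stale io-list entries and three dirty entries, the old condition (require_queued false) stops after the first pass with nothing written, while the patched condition goes around once more, queues the dirty entries and writes all three.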
From: Kemeng Shi <shikemeng(a)huaweicloud.com>
[ Upstream commit d92109891f21cf367caa2cc6dff11a4411d917f4 ]
In the case where there are no more inodes for IO in the io list from the
last wb_writeback, we may bail out early even if there is an inode in the
dirty list that should be written back. Only bail out once we have queued
at least once, to avoid missing dirtied inodes.
This is from code reading...
Signed-off-by: Kemeng Shi <shikemeng(a)huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-3-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack(a)suse.cz>
[brauner(a)kernel.org: fold in memory corruption fix from Jan in [1]]
Link: https://lore.kernel.org/r/20240405132346.bid7gibby3lxxhez@quack3 [1]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
fs/fs-writeback.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 3d84fcc471c60..e89222ae285e9 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2044,6 +2044,7 @@ static long wb_writeback(struct bdi_writeback *wb,
struct inode *inode;
long progress;
struct blk_plug plug;
+ bool queued = false;
blk_start_plug(&plug);
for (;;) {
@@ -2086,8 +2087,10 @@ static long wb_writeback(struct bdi_writeback *wb,
dirtied_before = jiffies;
trace_writeback_start(wb, work);
- if (list_empty(&wb->b_io))
+ if (list_empty(&wb->b_io)) {
queue_io(wb, work, dirtied_before);
+ queued = true;
+ }
if (work->sb)
progress = writeback_sb_inodes(work->sb, wb, work);
else
@@ -2102,7 +2105,7 @@ static long wb_writeback(struct bdi_writeback *wb,
* mean the overall work is done. So we keep looping as long
* as made some progress on cleaning pages or inodes.
*/
- if (progress) {
+ if (progress || !queued) {
spin_unlock(&wb->list_lock);
continue;
}
--
2.43.0
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
ODR switching happens in 2 steps: an update stores the new value, and the
value is then applied when the ODR change flag is received in the data.
When switching to the same ODR value, the ODR change flag never arrives,
and frequency switching is blocked waiting for an apply that never comes.
Fix the issue by preventing the update from happening when switching to
the same ODR value.
Fixes: 0ecc363ccea7 ("iio: make invensense timestamp module generic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
---
drivers/iio/common/inv_sensors/inv_sensors_timestamp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
index fa205f17bd90..f44458c380d9 100644
--- a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
+++ b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
@@ -60,11 +60,15 @@ EXPORT_SYMBOL_NS_GPL(inv_sensors_timestamp_init, IIO_INV_SENSORS_TIMESTAMP);
int inv_sensors_timestamp_update_odr(struct inv_sensors_timestamp *ts,
uint32_t period, bool fifo)
{
+ uint32_t mult;
+
/* when FIFO is on, prevent odr change if one is already pending */
if (fifo && ts->new_mult != 0)
return -EAGAIN;
- ts->new_mult = period / ts->chip.clock_period;
+ mult = period / ts->chip.clock_period;
+ if (mult != ts->mult)
+ ts->new_mult = mult;
return 0;
}
--
2.34.1
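The two-step ODR update described in the patch above can be tried out in userspace with a small C sketch. It mirrors the patched logic of inv_sensors_timestamp_update_odr(), but `sketch_ts` is a hypothetical flattened structure (the driver keeps the clock period in `ts->chip.clock_period`), and the sketch assumes a non-zero clock period.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for struct inv_sensors_timestamp. */
struct sketch_ts {
	uint32_t clock_period;	/* chip clock period, assumed non-zero */
	uint32_t mult;		/* multiplier currently applied */
	uint32_t new_mult;	/* pending multiplier, 0 when none */
};

/* Store a new multiplier only when it differs from the applied one. */
static int sketch_update_odr(struct sketch_ts *ts, uint32_t period, bool fifo)
{
	uint32_t mult;

	/* when FIFO is on, refuse a change while one is already pending */
	if (fifo && ts->new_mult != 0)
		return -EAGAIN;

	mult = period / ts->clock_period;
	if (mult != ts->mult)
		ts->new_mult = mult;	/* applied later, on the change flag */

	return 0;
}
```

Requesting the period that is already in use leaves new_mult at 0, so a later request is not rejected with -EAGAIN while waiting for a change flag that would never arrive.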
This is the start of the stable review cycle for the 6.9.2 release.
There are 25 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.9.2-rc1.…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.9.2-rc1
Christoph Hellwig <hch(a)lst.de>
block: add a partscan sysfs attribute for disks
Christoph Hellwig <hch(a)lst.de>
block: add a disk_has_partscan helper
SeongJae Park <sj(a)kernel.org>
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
SeongJae Park <sj(a)kernel.org>
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Thomas Weißschuh <linux(a)weissschuh.net>
admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Do not use WARN when encode fails
Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Revert "media: v4l2-ctrls: show all owned controls in log_status"
AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Javier Carrasco <javier.carrasco(a)wolfvision.net>
usb: typec: tipd: fix event checking for tps6598x
Javier Carrasco <javier.carrasco(a)wolfvision.net>
usb: typec: tipd: fix event checking for tps25750
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: displayport: Fix potential deadlock
Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
net: usb: ax88179_178a: fix link status when link is set to down/up
Prashanth K <quic_prashk(a)quicinc.com>
usb: dwc3: Wait unconditionally after issuing EndXfer command
Carlos Llamas <cmllamas(a)google.com>
binder: fix max_thread type inconsistency
Bard Liao <yung-chuan.liao(a)linux.intel.com>
ASoC: Intel: sof_sdw: use generic rtd_init function for Realtek SDW DMICs
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Fix memory leak in tpm2_key_encode()
Sungwoo Kim <iam(a)sung-woo.kim>
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Uros Bizjak <ubizjak(a)gmail.com>
x86/percpu: Use __force to cast from __percpu address space
Ronald Wahl <ronald.wahl(a)raritan.com>
net: ks8851: Fix another TX stall caused by wrong ISR flag handling
Jose Fernandez <josef(a)netflix.com>
drm/amd/display: Fix division by zero in setup_dsc_config
Perry Yuan <perry.yuan(a)amd.com>
cpufreq: amd-pstate: fix the highest frequency issue which limits performance
Ben Greear <greearb(a)candelatech.com>
wifi: iwlwifi: Use request_module_nowait
Peter Tsao <peter.tsao(a)mediatek.com>
Bluetooth: btusb: Fix the patch for MT7920 the affected to MT7921
-------------
Diffstat:
Documentation/ABI/stable/sysfs-block | 10 +++
.../admin-guide/hw-vuln/core-scheduling.rst | 4 +-
Documentation/admin-guide/mm/damon/usage.rst | 6 +-
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +-
arch/x86/include/asm/percpu.h | 6 +-
block/genhd.c | 15 +++--
block/partitions/core.c | 5 +-
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
drivers/bluetooth/btusb.c | 1 +
drivers/cpufreq/amd-pstate.c | 22 ++++++-
drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 7 +-
drivers/media/v4l2-core/v4l2-ctrls-core.c | 18 ++----
drivers/net/ethernet/micrel/ks8851_common.c | 18 +-----
drivers/net/usb/ax88179_178a.c | 37 +++++++----
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 10 +--
drivers/remoteproc/mtk_scp.c | 10 ++-
drivers/tty/serial/kgdboc.c | 30 ++++++++-
drivers/usb/dwc3/gadget.c | 4 +-
drivers/usb/typec/tipd/core.c | 51 ++++++++++-----
drivers/usb/typec/tipd/tps6598x.h | 11 ++++
drivers/usb/typec/ucsi/displayport.c | 4 --
include/linux/blkdev.h | 13 ++++
include/net/bluetooth/hci.h | 9 +++
include/net/bluetooth/hci_core.h | 1 +
net/bluetooth/hci_conn.c | 75 +++++++++++++++-------
net/bluetooth/hci_event.c | 31 +++++----
net/bluetooth/iso.c | 2 +-
net/bluetooth/l2cap_core.c | 17 +----
net/bluetooth/sco.c | 6 +-
security/keys/trusted-keys/trusted_tpm2.c | 25 ++++++--
sound/soc/intel/boards/Makefile | 1 +
sound/soc/intel/boards/sof_sdw.c | 12 ++--
sound/soc/intel/boards/sof_sdw_common.h | 1 +
sound/soc/intel/boards/sof_sdw_rt_dmic.c | 52 +++++++++++++++
36 files changed, 357 insertions(+), 166 deletions(-)
This is the start of the stable review cycle for the 6.6.32 release.
There are 102 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.32-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.6.32-rc1
Christoph Hellwig <hch(a)lst.de>
block: add a partscan sysfs attribute for disks
Christoph Hellwig <hch(a)lst.de>
block: add a disk_has_partscan helper
SeongJae Park <sj(a)kernel.org>
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Thomas Weißschuh <linux(a)weissschuh.net>
admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Do not use WARN when encode fails
AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Javier Carrasco <javier.carrasco(a)wolfvision.net>
usb: typec: tipd: fix event checking for tps6598x
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: displayport: Fix potential deadlock
Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
net: usb: ax88179_178a: fix link status when link is set to down/up
Prashanth K <quic_prashk(a)quicinc.com>
usb: dwc3: Wait unconditionally after issuing EndXfer command
Carlos Llamas <cmllamas(a)google.com>
binder: fix max_thread type inconsistency
Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()
Christian Brauner <brauner(a)kernel.org>
erofs: reliably distinguish block based and fscache mode
Baokun Li <libaokun1(a)huawei.com>
erofs: get rid of erofs_fs_context
Jiri Olsa <jolsa(a)kernel.org>
bpf: Add missing BPF_LINK_TYPE invocations
Mark Brown <broonie(a)kernel.org>
kselftest: Add a ksft_perror() helper
Mengqi Zhang <mengqi.zhang(a)mediatek.com>
mmc: core: Add HS400 tuning in HS400es initialization
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Fix memory leak in tpm2_key_encode()
Sungwoo Kim <iam(a)sung-woo.kim>
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Sungwoo Kim <iam(a)sung-woo.kim>
Bluetooth: L2CAP: Fix slab-use-after-free in l2cap_connect()
Jacob Keller <jacob.e.keller(a)intel.com>
ice: remove unnecessary duplicate checks for VF VSI ID
Jacob Keller <jacob.e.keller(a)intel.com>
ice: pass VSI pointer into ice_vc_isvalid_q_id
Ronald Wahl <ronald.wahl(a)raritan.com>
net: ks8851: Fix another TX stall caused by wrong ISR flag handling
Jose Fernandez <josef(a)netflix.com>
drm/amd/display: Fix division by zero in setup_dsc_config
Gustavo A. R. Silva <gustavoars(a)kernel.org>
smb: smb2pdu.h: Avoid -Wflex-array-member-not-at-end warnings
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add continuous availability share parameter
David Howells <dhowells(a)redhat.com>
cifs: Add tracing for the cifs_tcon struct refcounting
Paulo Alcantara <pc(a)manguebit.com>
smb: client: instantiate when creating SFU files
Paulo Alcantara <pc(a)manguebit.com>
smb: client: fix NULL ptr deref in cifs_mark_open_handles_for_deleted_file()
Steve French <stfrench(a)microsoft.com>
smb3: add trace event for mknod
Steve French <stfrench(a)microsoft.com>
smb311: additional compression flag defined in updated protocol spec
Steve French <stfrench(a)microsoft.com>
smb311: correct incorrect offset field in compression header
Steve French <stfrench(a)microsoft.com>
cifs: Move some extern decls from .c files to .h
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix potencial out-of-bounds when buffer offset is invalid
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16()
Colin Ian King <colin.i.king(a)gmail.com>
ksmbd: Fix spelling mistake "connction" -> "connection"
Marios Makassikis <mmakassikis(a)freebox.fr>
ksmbd: fix possible null-deref in smb_lazy_parent_lease_break_close
Bharath SM <bharathsm(a)microsoft.com>
cifs: remove redundant variable assignment
Meetakshi Setiya <msetiya(a)microsoft.com>
cifs: fixes for get_inode_info
Bharath SM <bharathsm(a)microsoft.com>
cifs: defer close file handles having RH lease
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: add support for durable handles v1/v2
Namjae Jeon <linkinjeon(a)kernel.org>
ksmbd: mark SMB2_SESSION_EXPIRED to session when destroying previous session
Enzo Matsumiya <ematsumiya(a)suse.de>
smb: common: simplify compression headers
Enzo Matsumiya <ematsumiya(a)suse.de>
smb: common: fix fields sizes in compression_pattern_payload_v1
Enzo Matsumiya <ematsumiya(a)suse.de>
smb: client: negotiate compression algorithms
Steve French <stfrench(a)microsoft.com>
smb3: add dynamic trace point for ioctls
Paulo Alcantara <pc(a)manguebit.com>
smb: client: return reparse type in /proc/mounts
Paulo Alcantara <pc(a)manguebit.com>
smb: client: set correct d_type for reparse DFS/DFSR and mount point
Paulo Alcantara <pc(a)manguebit.com>
smb: client: parse uid, gid, mode and dev from WSL reparse points
Steve French <stfrench(a)microsoft.com>
smb: client: introduce SMB2_OP_QUERY_WSL_EA
Dan Carpenter <dan.carpenter(a)linaro.org>
smb: client: Fix a NULL vs IS_ERR() check in wsl_set_xattrs()
Paulo Alcantara <pc(a)manguebit.com>
smb: client: add support for WSL reparse points
Paulo Alcantara <pc(a)manguebit.com>
smb: client: reduce number of parameters in smb2_compound_op()
Paulo Alcantara <pc(a)manguebit.com>
smb: client: fix potential broken compound request
Paulo Alcantara <pc(a)manguebit.com>
smb: client: move most of reparse point handling code to common file
Paulo Alcantara <pc(a)manguebit.com>
smb: client: introduce reparse mount option
Meetakshi Setiya <msetiya(a)microsoft.com>
smb: client: retry compound request without reusing lease
Steve French <stfrench(a)microsoft.com>
smb: client: do not defer close open handles to deleted files
Meetakshi Setiya <msetiya(a)microsoft.com>
smb: client: reuse file lease key in compound operations
Paulo Alcantara <pc(a)manguebit.com>
smb: client: get rid of smb311_posix_query_path_info()
Steve French <stfrench(a)microsoft.com>
smb: client: parse owner/group when creating reparse points
Steve French <stfrench(a)microsoft.com>
smb3: update allocation size more accurately on write completion
Paulo Alcantara <pc(a)manguebit.com>
smb: client: handle path separator of created SMB symlinks
Steve French <stfrench(a)microsoft.com>
cifs: update the same create_guid on replay
Yang Li <yang.lee(a)linux.alibaba.com>
ksmbd: Add kernel-doc for ksmbd_extract_sharename() function
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: set replay flag for retries of write command
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: commands that are retried should have replay flag set
Alexey Dobriyan <adobriyan(a)gmail.com>
smb: client: delete "true", "false" defines
Yang Li <yang.lee(a)linux.alibaba.com>
smb: Fix some kernel-doc comments
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: new mount option called retrans
Paulo Alcantara <pc(a)manguebit.com>
smb: client: don't clobber ->i_rdev from cached reparse points
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: new nt status codes from MS-SMB2
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: pick channel for tcon and tdis
Steve French <stfrench(a)microsoft.com>
cifs: minor comment cleanup
Colin Ian King <colin.i.king(a)gmail.com>
cifs: remove redundant variable tcon_exist
Randy Dunlap <rdunlap(a)infradead.org>
ksmbd: vfs: fix all kernel-doc warnings
Randy Dunlap <rdunlap(a)infradead.org>
ksmbd: auth: fix most kernel-doc warnings
Steve French <stfrench(a)microsoft.com>
cifs: remove unneeded return statement
Paulo Alcantara <pc(a)manguebit.com>
cifs: get rid of dup length check in parse_reparse_point()
David Howells <dhowells(a)redhat.com>
cifs: Pass unbyteswapped eof value into SMB2_set_eof()
Markus Elfring <elfring(a)users.sourceforge.net>
smb3: Improve exception handling in allocate_mr_list()
Steve French <stfrench(a)microsoft.com>
cifs: fix in logging in cifs_chan_update_iface
Paulo Alcantara <pc(a)manguebit.com>
smb: client: handle special files and symlinks in SMB3 POSIX
Paulo Alcantara <pc(a)manguebit.com>
smb: client: cleanup smb2_query_reparse_point()
Paulo Alcantara <pc(a)manguebit.com>
smb: client: allow creating symlinks via reparse points
Steve French <stfrench(a)microsoft.com>
smb: client: optimise reparse point querying
Steve French <stfrench(a)microsoft.com>
smb: client: allow creating special files via reparse points
Steve French <stfrench(a)microsoft.com>
smb: client: extend smb2_compound_op() to accept more commands
Pierre Mariani <pierre.mariani(a)gmail.com>
smb: client: Fix minor whitespace errors and warnings
Steve French <stfrench(a)microsoft.com>
smb: client: introduce cifs_sfu_make_node()
Ritvik Budhiraja <rbudhiraja(a)microsoft.com>
cifs: fix use after free for iface while disabling secondary channels
Steve French <stfrench(a)microsoft.com>
Missing field not being returned in ioctl CIFS_IOC_GET_MNT_INFO
Steve French <stfrench(a)microsoft.com>
smb3: minor cleanup of session handling code
Steve French <stfrench(a)microsoft.com>
smb3: more minor cleanups for session handling routines
Steve French <stfrench(a)microsoft.com>
smb3: minor RDMA cleanup
Shyam Prasad N <sprasad(a)microsoft.com>
cifs: print server capabilities in DebugData
Eric Biggers <ebiggers(a)google.com>
smb: use crypto_shash_digest() in symlink_hash()
Steve French <stfrench(a)microsoft.com>
Add definition for new smb3.1.1 command type
Steve French <stfrench(a)microsoft.com>
SMB3: clarify some of the unused CreateOption flags
Meetakshi Setiya <msetiya(a)microsoft.com>
cifs: Add client version details to NTLM authenticate message
-------------
Diffstat:
Documentation/ABI/stable/sysfs-block | 10 +
.../admin-guide/hw-vuln/core-scheduling.rst | 4 +-
Documentation/admin-guide/mm/damon/usage.rst | 2 +-
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +-
block/genhd.c | 15 +-
block/partitions/core.c | 5 +-
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +
drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 7 +-
drivers/mmc/core/mmc.c | 9 +-
drivers/net/ethernet/intel/ice/ice_virtchnl.c | 22 +-
drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 3 -
drivers/net/ethernet/micrel/ks8851_common.c | 18 +-
drivers/net/usb/ax88179_178a.c | 37 +-
drivers/remoteproc/mtk_scp.c | 10 +-
drivers/tty/serial/kgdboc.c | 30 +-
drivers/usb/dwc3/gadget.c | 4 +-
drivers/usb/typec/tipd/core.c | 45 +-
drivers/usb/typec/tipd/tps6598x.h | 11 +
drivers/usb/typec/ucsi/displayport.c | 4 -
fs/erofs/internal.h | 7 -
fs/erofs/super.c | 124 +-
fs/smb/client/Makefile | 2 +-
fs/smb/client/cached_dir.c | 24 +-
fs/smb/client/cifs_debug.c | 38 +-
fs/smb/client/cifsfs.c | 10 +-
fs/smb/client/cifsglob.h | 93 +-
fs/smb/client/cifsproto.h | 39 +-
fs/smb/client/cifssmb.c | 18 +-
fs/smb/client/connect.c | 57 +-
fs/smb/client/dir.c | 14 +-
fs/smb/client/file.c | 39 +-
fs/smb/client/fs_context.c | 43 +-
fs/smb/client/fs_context.h | 13 +-
fs/smb/client/fscache.c | 7 +
fs/smb/client/inode.c | 235 ++--
fs/smb/client/ioctl.c | 6 +
fs/smb/client/link.c | 41 +-
fs/smb/client/misc.c | 47 +-
fs/smb/client/ntlmssp.h | 4 +-
fs/smb/client/readdir.c | 32 +-
fs/smb/client/reparse.c | 532 ++++++++
fs/smb/client/reparse.h | 113 ++
fs/smb/client/sess.c | 73 +-
fs/smb/client/smb1ops.c | 80 +-
fs/smb/client/smb2glob.h | 27 +-
fs/smb/client/smb2inode.c | 1402 +++++++++++++-------
fs/smb/client/smb2maperror.c | 2 +
fs/smb/client/smb2misc.c | 10 +-
fs/smb/client/smb2ops.c | 603 ++++-----
fs/smb/client/smb2pdu.c | 340 ++++-
fs/smb/client/smb2pdu.h | 46 +-
fs/smb/client/smb2proto.h | 37 +-
fs/smb/client/smb2status.h | 2 +
fs/smb/client/smb2transport.c | 2 +
fs/smb/client/smbdirect.c | 4 +-
fs/smb/client/smbencrypt.c | 7 -
fs/smb/client/trace.h | 137 +-
fs/smb/common/smb2pdu.h | 122 +-
fs/smb/common/smbfsctl.h | 6 -
fs/smb/server/auth.c | 14 +-
fs/smb/server/ksmbd_netlink.h | 36 +-
fs/smb/server/mgmt/user_session.c | 28 +-
fs/smb/server/mgmt/user_session.h | 3 +
fs/smb/server/misc.c | 1 +
fs/smb/server/oplock.c | 96 +-
fs/smb/server/oplock.h | 7 +-
fs/smb/server/smb2misc.c | 26 +-
fs/smb/server/smb2ops.c | 6 +
fs/smb/server/smb2pdu.c | 338 ++++-
fs/smb/server/smb2pdu.h | 31 +-
fs/smb/server/transport_tcp.c | 2 +
fs/smb/server/vfs.c | 28 +-
fs/smb/server/vfs_cache.c | 137 +-
fs/smb/server/vfs_cache.h | 9 +
include/linux/blkdev.h | 13 +
include/linux/bpf_types.h | 3 +
include/net/bluetooth/hci.h | 9 +
include/net/bluetooth/hci_core.h | 1 +
net/bluetooth/hci_conn.c | 71 +-
net/bluetooth/hci_event.c | 31 +-
net/bluetooth/iso.c | 2 +-
net/bluetooth/l2cap_core.c | 38 +-
net/bluetooth/sco.c | 6 +-
security/keys/trusted-keys/trusted_tpm2.c | 25 +-
tools/testing/selftests/kselftest.h | 14 +
88 files changed, 3923 insertions(+), 1738 deletions(-)
This is the start of the stable review cycle for the 6.1.92 release.
There are 45 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.92-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.1.92-rc1
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Thomas Weißschuh <linux(a)weissschuh.net>
admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Do not use WARN when encode fails
AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Javier Carrasco <javier.carrasco(a)wolfvision.net>
usb: typec: tipd: fix event checking for tps6598x
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: displayport: Fix potential deadlock
Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
net: usb: ax88179_178a: fix link status when link is set to down/up
Prashanth K <quic_prashk(a)quicinc.com>
usb: dwc3: Wait unconditionally after issuing EndXfer command
Carlos Llamas <cmllamas(a)google.com>
binder: fix max_thread type inconsistency
Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()
Mark Rutland <mark.rutland(a)arm.com>
arm64: atomics: lse: remove stale dependency on JUMP_LABEL
Eric Sandeen <sandeen(a)redhat.com>
xfs: short circuit xfs_growfs_data_private() if delta is zero
Hironori Shiina <shiina.hironori(a)gmail.com>
xfs: get root inode correctly at bulkstat
Darrick J. Wong <djwong(a)kernel.org>
xfs: fix log recovery when unknown rocompat bits are set
Darrick J. Wong <djwong(a)kernel.org>
xfs: allow inode inactivation during a ro mount log recovery
Darrick J. Wong <djwong(a)kernel.org>
xfs: invalidate xfs_bufs when allocating cow extents
Darrick J. Wong <djwong(a)kernel.org>
xfs: estimate post-merge refcounts correctly
Darrick J. Wong <djwong(a)kernel.org>
xfs: hoist refcount record merge predicates
Guo Xuenan <guoxuenan(a)huawei.com>
xfs: fix super block buf log item UAF during force shutdown
Guo Xuenan <guoxuenan(a)huawei.com>
xfs: wait iclog complete before tearing down AIL
Darrick J. Wong <djwong(a)kernel.org>
xfs: attach dquots to inode before reading data/cow fork mappings
Darrick J. Wong <djwong(a)kernel.org>
xfs: invalidate block device page cache during unmount
Long Li <leo.lilong(a)huawei.com>
xfs: fix incorrect i_nlink caused by inode racing
Long Li <leo.lilong(a)huawei.com>
xfs: fix sb write verify for lazysbcount
Darrick J. Wong <djwong(a)kernel.org>
xfs: fix incorrect error-out in xfs_remove
Dave Chinner <dchinner(a)redhat.com>
xfs: fix off-by-one-block in xfs_discard_folio()
Dave Chinner <dchinner(a)redhat.com>
xfs: drop write error injection is unfixable, remove it
Dave Chinner <dchinner(a)redhat.com>
xfs: use iomap_valid method to detect stale cached iomaps
Dave Chinner <dchinner(a)redhat.com>
iomap: write iomap validity checks
Dave Chinner <dchinner(a)redhat.com>
xfs: xfs_bmap_punch_delalloc_range() should take a byte range
Dave Chinner <dchinner(a)redhat.com>
iomap: buffered write failure should not truncate the page cache
Dave Chinner <dchinner(a)redhat.com>
xfs,iomap: move delalloc punching to iomap
Dave Chinner <dchinner(a)redhat.com>
xfs: use byte ranges for write cleanup ranges
Dave Chinner <dchinner(a)redhat.com>
xfs: punching delalloc extents on write failure is racy
Dave Chinner <dchinner(a)redhat.com>
xfs: write page faults in iomap are not buffered writes
Mengqi Zhang <mengqi.zhang(a)mediatek.com>
mmc: core: Add HS400 tuning in HS400es initialization
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Fix memory leak in tpm2_key_encode()
NeilBrown <neilb(a)suse.de>
nfsd: don't allow nfsd threads to be signalled.
Aidan MacDonald <aidanmacdonald.0x0(a)gmail.com>
mfd: stpmic1: Fix swapped mask/unmask in irq chip
Sergey Shtylyov <s.shtylyov(a)omp.ru>
pinctrl: core: handle radix_tree_insert() errors in pinctrl_register_one_pin()
Jacob Keller <jacob.e.keller(a)intel.com>
ice: remove unnecessary duplicate checks for VF VSI ID
Jacob Keller <jacob.e.keller(a)intel.com>
ice: pass VSI pointer into ice_vc_isvalid_q_id
Ronald Wahl <ronald.wahl(a)raritan.com>
net: ks8851: Fix another TX stall caused by wrong ISR flag handling
Jose Fernandez <josef(a)netflix.com>
drm/amd/display: Fix division by zero in setup_dsc_config
-------------
Diffstat:
.../admin-guide/hw-vuln/core-scheduling.rst | 4 +-
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +-
arch/arm64/Kconfig | 1 -
arch/arm64/include/asm/lse.h | 1 -
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +
drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 7 +-
drivers/mfd/stpmic1.c | 5 +-
drivers/mmc/core/mmc.c | 9 +-
drivers/net/ethernet/intel/ice/ice_virtchnl.c | 22 +-
drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 3 -
drivers/net/ethernet/micrel/ks8851_common.c | 18 +-
drivers/net/usb/ax88179_178a.c | 37 ++-
drivers/pinctrl/core.c | 14 +-
drivers/remoteproc/mtk_scp.c | 10 +-
drivers/tty/serial/kgdboc.c | 30 ++-
drivers/usb/dwc3/gadget.c | 4 +-
drivers/usb/typec/tipd/core.c | 45 ++--
drivers/usb/typec/tipd/tps6598x.h | 11 +
drivers/usb/typec/ucsi/displayport.c | 4 -
fs/iomap/buffered-io.c | 254 ++++++++++++++++++++-
fs/iomap/iter.c | 19 +-
fs/nfs/callback.c | 9 +-
fs/nfsd/nfs4proc.c | 5 +-
fs/nfsd/nfssvc.c | 12 -
fs/xfs/libxfs/xfs_bmap.c | 8 +-
fs/xfs/libxfs/xfs_errortag.h | 12 +-
fs/xfs/libxfs/xfs_refcount.c | 146 ++++++++++--
fs/xfs/libxfs/xfs_sb.c | 7 +-
fs/xfs/xfs_aops.c | 37 +--
fs/xfs/xfs_bmap_util.c | 10 +-
fs/xfs/xfs_bmap_util.h | 2 +-
fs/xfs/xfs_buf.c | 1 +
fs/xfs/xfs_buf_item.c | 2 +
fs/xfs/xfs_error.c | 27 ++-
fs/xfs/xfs_file.c | 2 +-
fs/xfs/xfs_fsops.c | 4 +
fs/xfs/xfs_icache.c | 6 +
fs/xfs/xfs_inode.c | 16 +-
fs/xfs/xfs_ioctl.c | 4 +-
fs/xfs/xfs_iomap.c | 177 ++++++++------
fs/xfs/xfs_iomap.h | 6 +-
fs/xfs/xfs_log.c | 53 ++---
fs/xfs/xfs_mount.c | 15 ++
fs/xfs/xfs_pnfs.c | 6 +-
include/linux/iomap.h | 47 +++-
net/sunrpc/svc_xprt.c | 16 +-
security/keys/trusted-keys/trusted_tpm2.c | 25 +-
50 files changed, 866 insertions(+), 299 deletions(-)
This is the start of the stable review cycle for the 6.8.11 release.
There are 23 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.8.11-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.8.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 6.8.11-rc1
Christoph Hellwig <hch(a)lst.de>
block: add a partscan sysfs attribute for disks
Christoph Hellwig <hch(a)lst.de>
block: add a disk_has_partscan helper
SeongJae Park <sj(a)kernel.org>
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Thomas Weißschuh <linux(a)weissschuh.net>
admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Do not use WARN when encode fails
AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Javier Carrasco <javier.carrasco(a)wolfvision.net>
usb: typec: tipd: fix event checking for tps6598x
Javier Carrasco <javier.carrasco(a)wolfvision.net>
usb: typec: tipd: fix event checking for tps25750
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: displayport: Fix potential deadlock
Jose Ignacio Tornos Martinez <jtornosm(a)redhat.com>
net: usb: ax88179_178a: fix link status when link is set to down/up
Prashanth K <quic_prashk(a)quicinc.com>
usb: dwc3: Wait unconditionally after issuing EndXfer command
Carlos Llamas <cmllamas(a)google.com>
binder: fix max_thread type inconsistency
Christian Brauner <brauner(a)kernel.org>
erofs: reliably distinguish block based and fscache mode
Baokun Li <libaokun1(a)huawei.com>
erofs: get rid of erofs_fs_context
Jarkko Sakkinen <jarkko(a)kernel.org>
KEYS: trusted: Fix memory leak in tpm2_key_encode()
Sungwoo Kim <iam(a)sung-woo.kim>
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Sungwoo Kim <iam(a)sung-woo.kim>
Bluetooth: L2CAP: Fix slab-use-after-free in l2cap_connect()
Jacob Keller <jacob.e.keller(a)intel.com>
ice: remove unnecessary duplicate checks for VF VSI ID
Jacob Keller <jacob.e.keller(a)intel.com>
ice: pass VSI pointer into ice_vc_isvalid_q_id
Ronald Wahl <ronald.wahl(a)raritan.com>
net: ks8851: Fix another TX stall caused by wrong ISR flag handling
Jose Fernandez <josef(a)netflix.com>
drm/amd/display: Fix division by zero in setup_dsc_config
-------------
Diffstat:
Documentation/ABI/stable/sysfs-block | 10 ++
.../admin-guide/hw-vuln/core-scheduling.rst | 4 +-
Documentation/admin-guide/mm/damon/usage.rst | 2 +-
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +-
block/genhd.c | 15 ++-
block/partitions/core.c | 5 +-
drivers/android/binder.c | 2 +-
drivers/android/binder_internal.h | 2 +-
drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 7 +-
drivers/net/ethernet/intel/ice/ice_virtchnl.c | 22 ++--
drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 3 -
drivers/net/ethernet/micrel/ks8851_common.c | 18 +--
drivers/net/usb/ax88179_178a.c | 37 ++++--
drivers/remoteproc/mtk_scp.c | 10 +-
drivers/tty/serial/kgdboc.c | 30 ++++-
drivers/usb/dwc3/gadget.c | 4 +-
drivers/usb/typec/tipd/core.c | 51 ++++++---
drivers/usb/typec/tipd/tps6598x.h | 11 ++
drivers/usb/typec/ucsi/displayport.c | 4 -
fs/erofs/internal.h | 7 --
fs/erofs/super.c | 124 +++++++++------------
include/linux/blkdev.h | 13 +++
include/net/bluetooth/hci.h | 9 ++
include/net/bluetooth/hci_core.h | 1 +
net/bluetooth/hci_conn.c | 71 ++++++++----
net/bluetooth/hci_event.c | 31 ++++--
net/bluetooth/iso.c | 2 +-
net/bluetooth/l2cap_core.c | 38 +++----
net/bluetooth/sco.c | 6 +-
security/keys/trusted-keys/trusted_tpm2.c | 25 ++++-
31 files changed, 339 insertions(+), 230 deletions(-)
Commit 1b151e2435fc ("block: Remove special-casing of compound
pages") caused a change in behaviour when releasing the pages
if the buffer does not start at the beginning of the page. This
was because the calculation of the number of pages to release
was incorrect.
This was fixed by commit 38b43539d64b ("block: Fix page refcounts
for unaligned buffers in __bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading
to a memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether
the offsets are aligned or unaligned w.r.t page boundary.
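As a rough illustration of why the start offset matters when counting the
pages behind a buffer, here is a minimal sketch in C. It is not the kernel's
__bio_release_pages() code: the helper names pages_spanned() and
pages_ignoring_offset() are made up for this note, and a 4096-byte base page
is assumed in the figures quoted afterwards.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative helper (hypothetical, not kernel code): number of base pages
 * touched by a buffer of `len` bytes that starts `offset` bytes into a
 * larger mapping.
 */
static size_t pages_spanned(size_t offset, size_t len, size_t page_size)
{
	assert(len > 0 && page_size > 0);

	size_t first = offset / page_size;            /* index of first page touched */
	size_t last = (offset + len - 1) / page_size; /* index of last page touched */

	return last - first + 1;
}

/* Naive count that rounds the length up but ignores the starting offset. */
static size_t pages_ignoring_offset(size_t len, size_t page_size)
{
	assert(len > 0 && page_size > 0);

	return (len + page_size - 1) / page_size;
}
```

With a 4096-byte page, the fourth case in the test (start at half a page,
length of three pages) gives pages_spanned(2048, 12288, 4096) == 4 but
pages_ignoring_offset(12288, 4096) == 3. The other three cases give 3 from
both helpers, so only a buffer that is unaligned at both ends exposes the
difference in this simplified model.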
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
---
tools/testing/selftests/mm/Makefile | 1 +
tools/testing/selftests/mm/hugetlb_dio.c | 118 +++++++++++++++++++++++
2 files changed, 119 insertions(+)
create mode 100644 tools/testing/selftests/mm/hugetlb_dio.c
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index eb5f39a2668b..87d8130b3376 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -71,6 +71,7 @@ TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
TEST_GEN_FILES += hugetlb_fault_after_madv
TEST_GEN_FILES += hugetlb_madv_vs_map
+TEST_GEN_FILES += hugetlb_dio
ifneq ($(ARCH),arm64)
TEST_GEN_FILES += soft-dirty
diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
new file mode 100644
index 000000000000..6f6587c7913c
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb_dio.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program tests for hugepage leaks after DIO writes to a file using a
+ * hugepage as the user buffer. During DIO, the user buffer is pinned and
+ * should be properly unpinned upon completion. This patch verifies that the
+ * kernel correctly unpins the buffer at DIO completion for both aligned and
+ * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
+ * is freed upon unmapping.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include "vm_util.h"
+#include "../kselftest.h"
+
+void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
+{
+ int fd;
+ char *buffer = NULL;
+ char *orig_buffer = NULL;
+ size_t h_pagesize = 0;
+ size_t writesize;
+ int free_hpage_b = 0;
+ int free_hpage_a = 0;
+
+ writesize = end_off - start_off;
+
+ /* Get the default huge page size */
+ h_pagesize = default_huge_page_size();
+ if (!h_pagesize)
+ ksft_exit_fail_msg("Unable to determine huge page size\n");
+
+ /* Open the file to DIO */
+ fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT);
+ if (fd < 0)
+ ksft_exit_fail_msg("Error opening file\n");
+
+ /* Get the free huge pages before allocation */
+ free_hpage_b = get_free_hugepages();
+ if (free_hpage_b == 0) {
+ close(fd);
+ ksft_exit_skip("No free hugepage, exiting!\n");
+ }
+
+ /* Allocate a hugetlb page */
+ orig_buffer = mmap(NULL, h_pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE
+ | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
+ if (orig_buffer == MAP_FAILED) {
+ close(fd);
+ ksft_exit_fail_msg("Error mapping memory\n");
+ }
+ buffer = orig_buffer;
+ buffer += start_off;
+
+ memset(buffer, 'A', writesize);
+
+ /* Write the buffer to the file */
+ if (write(fd, buffer, writesize) != (writesize)) {
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+ ksft_exit_fail_msg("Error writing to file\n");
+ }
+
+ /* unmap the huge page */
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+
+ /* Get the free huge pages after unmap */
+ free_hpage_a = get_free_hugepages();
+
+ /*
+ * If the no. of free hugepages before allocation and after unmap does
+ * not match - that means there could still be a page which is pinned.
+ */
+ if (free_hpage_a != free_hpage_b) {
+ printf("No. Free pages before allocation : %d\n", free_hpage_b);
+ printf("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_fail(": Huge pages not freed!\n");
+ } else {
+ printf("No. Free pages before allocation : %d\n", free_hpage_b);
+ printf("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_pass(": Huge pages freed successfully !\n");
+ }
+}
+
+int main(void)
+{
+ size_t pagesize = 0;
+
+ ksft_print_header();
+ ksft_set_plan(4);
+
+ /* Get base page size */
+ pagesize = psize();
+
+ /* start and end are aligned to pagesize */
+ run_dio_using_hugetlb(0, (pagesize * 3));
+
+ /* start is aligned but end is not aligned */
+ run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
+
+ /* start is unaligned and end is aligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
+
+ /* both start and end are unaligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
+
+ ksft_finished();
+ return 0;
+}
+
--
2.39.3
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 3d8f874bd620ce03f75a5512847586828ab86544
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052548-prance-gliding-4f31@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3d8f874bd620ce03f75a5512847586828ab86544 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Fri, 10 May 2024 11:50:27 +0800
Subject: [PATCH] io_uring: fail NOP if non-zero op flags is passed in
The NOP op flags should have been checked from the beginning, as for any
other opcode; otherwise NOP cannot be extended to use the op flags later.
Given that both liburing and the Rust io-uring crate always zero the SQE op
flags, just ignore users that drive the raw NOP uring interface without
zeroing the SQE, because NOP is only for test purposes. That way we avoid
adding a separate NOP2 opcode.
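The reasoning above is the usual rule for reserved fields: if non-zero flags
are refused today, a later kernel can give them meaning without breaking
existing callers. A minimal sketch of that rule, using a hypothetical
demo_sqe structure and demo_nop_prep() function rather than the real
io_uring types:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified submission entry; not the real struct io_uring_sqe. */
struct demo_sqe {
	uint8_t opcode;
	uint32_t op_flags; /* per-opcode flags; none are defined for the no-op yet */
};

/* Refuse any flag bits that are not understood, so they stay free for later use. */
static int demo_nop_prep(const struct demo_sqe *sqe)
{
	assert(sqe);

	if (sqe->op_flags)
		return -EINVAL;
	return 0;
}
```

A caller that zeroes the whole entry before filling it in is unaffected; only
a caller that leaves stale bits in op_flags sees -EINVAL.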
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Fixes: 2b188cc1bb85 ("Add io_uring IO interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Link: https://lore.kernel.org/r/20240510035031.78874-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/nop.c b/io_uring/nop.c
index d956599a3c1b..1a4e312dfe51 100644
--- a/io_uring/nop.c
+++ b/io_uring/nop.c
@@ -12,6 +12,8 @@
int io_nop_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
+ if (READ_ONCE(sqe->rw_flags))
+ return -EINVAL;
return 0;
}
tpm2_load_cmd() incorrectly checks options->keyhandle for the legacy format
too, as the inline comment also implies. Check options->keyhandle only when
the ASN.1 format is loaded.
Cc: James Bottomley <James.Bottomley(a)HansenPartnership.com>
Cc: stable(a)vger.kernel.org # v5.13+
Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs")
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
security/keys/trusted-keys/trusted_tpm2.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/security/keys/trusted-keys/trusted_tpm2.c b/security/keys/trusted-keys/trusted_tpm2.c
index 8b7dd73d94c1..4f8207bf52a7 100644
--- a/security/keys/trusted-keys/trusted_tpm2.c
+++ b/security/keys/trusted-keys/trusted_tpm2.c
@@ -400,12 +400,11 @@ static int tpm2_load_cmd(struct tpm_chip *chip,
/* old form */
blob = payload->blob;
payload->old_format = 1;
+ } else {
+ if (!options->keyhandle)
+ return -EINVAL;
}
- /* new format carries keyhandle but old format doesn't */
- if (!options->keyhandle)
- return -EINVAL;
-
/* must be big enough for at least the two be16 size counts */
if (payload->blob_len < 4)
return -EINVAL;
--
2.45.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 19fb11d7220b8abc016aa254dc7e6d9f2d49b178
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052522-mud-contempt-f78d@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19fb11d7220b8abc016aa254dc7e6d9f2d49b178 Mon Sep 17 00:00:00 2001
From: Nuno Sa <nuno.sa(a)analog.com>
Date: Fri, 26 Apr 2024 17:42:12 +0200
Subject: [PATCH] dt-bindings: adc: axi-adc: add clocks property
Add a required clock property as we can't access the device registers if
the AXI bus clock is not properly enabled.
Note this clock is a very fundamental one that is typically enabled
pretty early during boot. Independently of that, we should really rely on
it to be enabled.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Fixes: 96553a44e96d ("dt-bindings: iio: adc: add bindings doc for AXI ADC driver")
Signed-off-by: Nuno Sa <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20240426-ad9467-new-features-v2-3-6361fc3ba1cc@an…
Cc: <Stable(a)ver.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
index 3d49d21ad33d..e1f450b80db2 100644
--- a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
@@ -28,6 +28,9 @@ properties:
reg:
maxItems: 1
+ clocks:
+ maxItems: 1
+
dmas:
maxItems: 1
@@ -48,6 +51,7 @@ required:
- compatible
- dmas
- reg
+ - clocks
additionalProperties: false
@@ -58,6 +62,7 @@ examples:
reg = <0x44a00000 0x10000>;
dmas = <&rx_dma 0>;
dma-names = "rx";
+ clocks = <&axi_clk>;
#io-backend-cells = <0>;
};
...
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 19fb11d7220b8abc016aa254dc7e6d9f2d49b178
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052521-audacious-renewably-220c@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19fb11d7220b8abc016aa254dc7e6d9f2d49b178 Mon Sep 17 00:00:00 2001
From: Nuno Sa <nuno.sa(a)analog.com>
Date: Fri, 26 Apr 2024 17:42:12 +0200
Subject: [PATCH] dt-bindings: adc: axi-adc: add clocks property
Add a required clock property as we can't access the device registers if
the AXI bus clock is not properly enabled.
Note this clock is a very fundamental one that is typically enabled
pretty early during boot. Independently of that, we should really rely on
it to be enabled.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Fixes: 96553a44e96d ("dt-bindings: iio: adc: add bindings doc for AXI ADC driver")
Signed-off-by: Nuno Sa <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20240426-ad9467-new-features-v2-3-6361fc3ba1cc@an…
Cc: <Stable(a)ver.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
index 3d49d21ad33d..e1f450b80db2 100644
--- a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
@@ -28,6 +28,9 @@ properties:
reg:
maxItems: 1
+ clocks:
+ maxItems: 1
+
dmas:
maxItems: 1
@@ -48,6 +51,7 @@ required:
- compatible
- dmas
- reg
+ - clocks
additionalProperties: false
@@ -58,6 +62,7 @@ examples:
reg = <0x44a00000 0x10000>;
dmas = <&rx_dma 0>;
dma-names = "rx";
+ clocks = <&axi_clk>;
#io-backend-cells = <0>;
};
...
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 19fb11d7220b8abc016aa254dc7e6d9f2d49b178
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052520-dainty-reflected-ff22@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19fb11d7220b8abc016aa254dc7e6d9f2d49b178 Mon Sep 17 00:00:00 2001
From: Nuno Sa <nuno.sa(a)analog.com>
Date: Fri, 26 Apr 2024 17:42:12 +0200
Subject: [PATCH] dt-bindings: adc: axi-adc: add clocks property
Add a required clock property as we can't access the device registers if
the AXI bus clock is not properly enabled.
Note this clock is a very fundamental one that is typically enabled
pretty early during boot. Independently of that, we should really rely on
it to be enabled.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Fixes: 96553a44e96d ("dt-bindings: iio: adc: add bindings doc for AXI ADC driver")
Signed-off-by: Nuno Sa <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20240426-ad9467-new-features-v2-3-6361fc3ba1cc@an…
Cc: <Stable(a)ver.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
index 3d49d21ad33d..e1f450b80db2 100644
--- a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
@@ -28,6 +28,9 @@ properties:
reg:
maxItems: 1
+ clocks:
+ maxItems: 1
+
dmas:
maxItems: 1
@@ -48,6 +51,7 @@ required:
- compatible
- dmas
- reg
+ - clocks
additionalProperties: false
@@ -58,6 +62,7 @@ examples:
reg = <0x44a00000 0x10000>;
dmas = <&rx_dma 0>;
dma-names = "rx";
+ clocks = <&axi_clk>;
#io-backend-cells = <0>;
};
...
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 19fb11d7220b8abc016aa254dc7e6d9f2d49b178
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052520-enroll-deftly-8f54@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19fb11d7220b8abc016aa254dc7e6d9f2d49b178 Mon Sep 17 00:00:00 2001
From: Nuno Sa <nuno.sa(a)analog.com>
Date: Fri, 26 Apr 2024 17:42:12 +0200
Subject: [PATCH] dt-bindings: adc: axi-adc: add clocks property
Add a required clock property as we can't access the device registers if
the AXI bus clock is not properly enabled.
Note this clock is a very fundamental one that is typically enabled
pretty early during boot. Independently of that, we should really rely on
it to be enabled.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Fixes: 96553a44e96d ("dt-bindings: iio: adc: add bindings doc for AXI ADC driver")
Signed-off-by: Nuno Sa <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20240426-ad9467-new-features-v2-3-6361fc3ba1cc@an…
Cc: <Stable(a)ver.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
index 3d49d21ad33d..e1f450b80db2 100644
--- a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
@@ -28,6 +28,9 @@ properties:
reg:
maxItems: 1
+ clocks:
+ maxItems: 1
+
dmas:
maxItems: 1
@@ -48,6 +51,7 @@ required:
- compatible
- dmas
- reg
+ - clocks
additionalProperties: false
@@ -58,6 +62,7 @@ examples:
reg = <0x44a00000 0x10000>;
dmas = <&rx_dma 0>;
dma-names = "rx";
+ clocks = <&axi_clk>;
#io-backend-cells = <0>;
};
...
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 19fb11d7220b8abc016aa254dc7e6d9f2d49b178
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052519-estrogen-babble-023a@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 19fb11d7220b8abc016aa254dc7e6d9f2d49b178 Mon Sep 17 00:00:00 2001
From: Nuno Sa <nuno.sa(a)analog.com>
Date: Fri, 26 Apr 2024 17:42:12 +0200
Subject: [PATCH] dt-bindings: adc: axi-adc: add clocks property
Add a required clock property as we can't access the device registers if
the AXI bus clock is not properly enabled.
Note this clock is a very fundamental one that is typically enabled
pretty early during boot. Independently of that, we should really rely on
it to be enabled.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Fixes: 96553a44e96d ("dt-bindings: iio: adc: add bindings doc for AXI ADC driver")
Signed-off-by: Nuno Sa <nuno.sa(a)analog.com>
Link: https://lore.kernel.org/r/20240426-ad9467-new-features-v2-3-6361fc3ba1cc@an…
Cc: <Stable(a)ver.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
index 3d49d21ad33d..e1f450b80db2 100644
--- a/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
+++ b/Documentation/devicetree/bindings/iio/adc/adi,axi-adc.yaml
@@ -28,6 +28,9 @@ properties:
reg:
maxItems: 1
+ clocks:
+ maxItems: 1
+
dmas:
maxItems: 1
@@ -48,6 +51,7 @@ required:
- compatible
- dmas
- reg
+ - clocks
additionalProperties: false
@@ -58,6 +62,7 @@ examples:
reg = <0x44a00000 0x10000>;
dmas = <&rx_dma 0>;
dma-names = "rx";
+ clocks = <&axi_clk>;
#io-backend-cells = <0>;
};
...
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d493230da361028ea8a4675de334bfa1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052509-upswing-sloping-5ca5@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all of the three following conditions are true then transmitting
data stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no queued SKBs in the driver (txq) that wait for being written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because reenabling the TX queue happens when handling the TX
interrupt status but if the TX status bit has already been cleared then
this interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
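The ordering problem described above can be modelled in a few lines of C.
This is only a toy sketch under stated assumptions: the status register is
treated as write-1-to-clear, and demo_dev, DEMO_IRQ_TX and the demo_*()
helpers are hypothetical names, not part of the ks8851 driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_IRQ_TX 0x1u /* hypothetical bit, not the real IRQ_TXI value */

/* Toy device with a write-1-to-clear interrupt status register. */
struct demo_dev {
	uint16_t isr;
};

static void demo_hw_raise(struct demo_dev *d, uint16_t bits)
{
	d->isr |= bits;
}

static void demo_ack_isr(struct demo_dev *d, uint16_t bits)
{
	d->isr &= (uint16_t)~bits;
}

/*
 * Run one handler pass. `event_during_handling` models the hardware raising
 * TX again while the handler is still busy. Returns true if a TX event is
 * still pending in the status register after the pass.
 */
static bool demo_irq_pass(struct demo_dev *d, bool ack_first,
			  bool event_during_handling)
{
	assert(d);

	uint16_t status = d->isr;

	if (ack_first)
		demo_ack_isr(d, status); /* new order: clear, then handle */

	/* ... the TX event would be handled here ... */
	if (event_during_handling)
		demo_hw_raise(d, DEMO_IRQ_TX);

	if (!ack_first)
		demo_ack_isr(d, status); /* old order: handle, then clear */

	return (d->isr & DEMO_IRQ_TX) != 0;
}
```

In this model, acknowledging after handling wipes out the event that arrived
in the meantime, so nothing is left pending and no further interrupt would be
seen. Acknowledging first leaves the new event pending for the next pass.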
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d493230da361028ea8a4675de334bfa1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052508-overlap-uncured-11bd@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all of the three following conditions are true then transmitting
data stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no queued SKBs in the driver (txq) that wait for being written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because reenabling the TX queue happens when handling the TX
interrupt status but if the TX status bit has already been cleared then
this interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
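For anyone preparing a backport, the race described in the commit message can be modelled in user space. The sketch below is a toy model, not driver code: `struct sim_dev`, `SIM_IRQ_TXI`, `SIM_TX_NEEDED` and all `sim_*` helpers are hypothetical names, and the status register is assumed to be write-1-to-clear. It shows that acknowledging the status bits after handling can wipe out an event that arrived in between, while acknowledging them first leaves that event pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_IRQ_TXI   0x4000u /* hypothetical TX-done status bit */
#define SIM_TX_NEEDED 1       /* hypothetical room needed to wake the queue */

struct sim_dev {
	uint16_t isr;       /* assumed write-1-to-clear status register */
	int tx_free;        /* free room in the simulated TX buffer */
	bool queue_stopped; /* simulated netdev TX queue state */
};

/* Hardware finishes a frame: room becomes available, TX status is raised. */
static void sim_hw_tx_complete(struct sim_dev *d)
{
	assert(d != NULL);
	d->tx_free = SIM_TX_NEEDED;
	d->isr |= SIM_IRQ_TXI;
}

/* Acknowledge (clear) the given status bits. */
static void sim_ack(struct sim_dev *d, uint16_t bits)
{
	d->isr &= (uint16_t)~bits;
}

/*
 * Run one simulated interrupt with a TX completion landing in the window
 * between handling TX and the end of the handler. Returns true if the
 * queue is left stopped with no TX interrupt pending, i.e. a stall.
 */
static bool sim_stalls(bool ack_first)
{
	struct sim_dev d = { .isr = SIM_IRQ_TXI, .tx_free = 0,
			     .queue_stopped = true };
	uint16_t status = d.isr;

	if (ack_first)
		sim_ack(&d, status);  /* new order: clear, then handle */

	if ((status & SIM_IRQ_TXI) && d.tx_free >= SIM_TX_NEEDED)
		d.queue_stopped = false;

	sim_hw_tx_complete(&d);       /* event arrives inside the window */

	if (!ack_first)
		sim_ack(&d, status);  /* old order: this erases the new event */

	return d.queue_stopped && !(d.isr & SIM_IRQ_TXI);
}
```

In this model `sim_stalls(false)` reports a stall and `sim_stalls(true)` does not, because in the second case the late completion leaves the status bit set and would trigger another pass through the handler.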
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d493230da361028ea8a4675de334bfa1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052506-refining-rehab-7774@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all of the three following conditions are true then transmitting
data stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no queued SKBs in the driver (txq) that wait for being written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because reenabling the TX queue happens when handling the TX
interrupt status but if the TX status bit has already been cleared then
this interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d493230da361028ea8a4675de334bfa1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052505-confiding-stegosaur-016d@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all of the three following conditions are true then transmitting
data stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no queued SKBs in the driver (txq) that wait for being written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because reenabling the TX queue happens when handling the TX
interrupt status but if the TX status bit has already been cleared then
this interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 6.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.8.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d493230da361028ea8a4675de334bfa1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052504-same-handiness-8e6b@gregkh' --subject-prefix 'PATCH 6.8.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all of the three following conditions are true then transmitting
data stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no queued SKBs in the driver (txq) that wait for being written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because reenabling the TX queue happens when handling the TX
interrupt status but if the TX status bit has already been cleared then
this interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 6.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.9.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d493230da361028ea8a4675de334bfa1a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052503-purchase-hardcore-0032@gregkh' --subject-prefix 'PATCH 6.9.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all of the three following conditions are true then transmitting
data stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no queued SKBs in the driver (txq) that wait for being written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because reenabling the TX queue happens when handling the TX
interrupt status but if the TX status bit has already been cleared then
this interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3d8f874bd620ce03f75a5512847586828ab86544
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052547-overdraft-murmuring-02fd@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3d8f874bd620ce03f75a5512847586828ab86544 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Fri, 10 May 2024 11:50:27 +0800
Subject: [PATCH] io_uring: fail NOP if non-zero op flags is passed in
The NOP op flags should have been checked from the beginning, like those of
any other opcode; otherwise NOP cannot be extended with op flags later.
Given that both liburing and the Rust io-uring crate always zero the SQE op
flags, just ignore users who drive the raw NOP uring interface without zeroing
the SQE, because NOP is for test purposes only. This saves us a separate NOP2 opcode.
Suggested-by: Jens Axboe <axboe(a)kernel.dk>
Fixes: 2b188cc1bb85 ("Add io_uring IO interface")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Link: https://lore.kernel.org/r/20240510035031.78874-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/nop.c b/io_uring/nop.c
index d956599a3c1b..1a4e312dfe51 100644
--- a/io_uring/nop.c
+++ b/io_uring/nop.c
@@ -12,6 +12,8 @@
int io_nop_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
{
+ if (READ_ONCE(sqe->rw_flags))
+ return -EINVAL;
return 0;
}
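The reasoning behind this check (reject flag bits that have no meaning yet, so they can be assigned one later without breaking callers) can be shown with a small user-space sketch. Everything here is hypothetical and only mirrors the shape of the kernel change: `struct toy_sqe`, `TOY_NOP_KNOWN_FLAGS` and `toy_nop_prep()` are made-up names, not io_uring API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a submission entry; not the real io_uring SQE. */
struct toy_sqe {
	uint8_t opcode;
	uint32_t op_flags;
};

/* No NOP flags are defined yet in this toy model. */
#define TOY_NOP_KNOWN_FLAGS 0u

/*
 * Fail the request if any unknown flag bit is set. Once a flag is added
 * to TOY_NOP_KNOWN_FLAGS, old binaries that passed zero keep working and
 * nobody can have been relying on garbage in that bit.
 */
static int toy_nop_prep(const struct toy_sqe *sqe)
{
	assert(sqe != NULL);
	if (sqe->op_flags & ~TOY_NOP_KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

A caller that zeroes the entry gets 0 back; one that leaves a stray bit in `op_flags` gets `-EINVAL`.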
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 0774d19038c496f0c3602fb505c43e1b2d8eed85
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052513-reason-faceplate-ef1a@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0774d19038c496f0c3602fb505c43e1b2d8eed85 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Date: Mon, 29 Apr 2024 14:50:41 -0700
Subject: [PATCH] Input: try trimming too long modalias strings
If an input device declares too many capability bits then modalias
string for such device may become too long and not fit into uevent
buffer, resulting in failure of sending said uevent. This, in turn,
may prevent userspace from recognizing existence of such devices.
This is typically not a concern for real hardware devices as they have
a limited number of keys, but it does happen with synthetic devices such as
the ones created by the xen-kbdfront driver, which creates devices as being
capable of delivering all possible keys, since it doesn't know what
keys the backend may produce.
To deal with such devices input core will attempt to trim key data,
in the hope that the rest of modalias string will fit in the given
buffer. When trimming key data it will indicate that it is not
complete by placing "+," sign, resulting in conversions like this:
old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
new: k71,72,73,74,78,7A,7B,7C,+,
This should allow existing udev rules to continue to work with existing
devices, and will also allow writing more complex rules that would
recognize trimmed modalias and check input device characteristics by
other means (for example by parsing KEY= data in uevent or parsing
input device sysfs attributes).
Note that the driver core may try adding more uevent environment
variables once input core is done adding its own, so when forming
modalias we cannot use the entire available buffer, so we reduce
it by a somewhat arbitrary amount (96 bytes).
Reported-by: Jason Andryuk <jandryuk(a)gmail.com>
Reviewed-by: Peter Hutterer <peter.hutterer(a)who-t.net>
Tested-by: Jason Andryuk <jandryuk(a)gmail.com>
Link: https://lore.kernel.org/r/ZjAWMQCJdrxZkvkB@google.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
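The trimming rule described in the commit message can be tried out in user space with the sketch below. It is a simplified illustration under stated assumptions, not the kernel implementation: `toy_trim_keys()` is a hypothetical helper that works on the key part of the modalias alone and takes a plain length budget, whereas the real code also has to reserve room for the parts of the modalias that follow the key data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Trim a key list such as "k71,72,73," in place so that it is at most
 * max_len characters long, marking the cut with "+,". Returns the
 * resulting length. The string is left untouched if it already fits or
 * if max_len is too small to hold even "k+,".
 */
static size_t toy_trim_keys(char *keys, size_t max_len)
{
	size_t len;

	assert(keys != NULL);
	len = strlen(keys);
	if (len <= max_len || max_len < 3)
		return len;

	/*
	 * Cut right after the last 'k' or ',' that still leaves room for
	 * the two characters of "+,": index i must satisfy i + 3 <= max_len.
	 */
	for (size_t i = max_len - 2; i-- > 0; ) {
		if (keys[i] == 'k' || keys[i] == ',') {
			memcpy(keys + i + 1, "+,", 3); /* includes the NUL */
			return i + 3;
		}
	}
	return len;
}
```

With a budget of 27 characters, the "old" string from the commit message is cut down to the "new" one shown there; a string that already fits is returned unchanged.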
diff --git a/drivers/input/input.c b/drivers/input/input.c
index 711485437567..fd4997ba263c 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1378,19 +1378,19 @@ static int input_print_modalias_bits(char *buf, int size,
char name, const unsigned long *bm,
unsigned int min_bit, unsigned int max_bit)
{
- int len = 0, i;
+ int bit = min_bit;
+ int len = 0;
len += snprintf(buf, max(size, 0), "%c", name);
- for (i = min_bit; i < max_bit; i++)
- if (bm[BIT_WORD(i)] & BIT_MASK(i))
- len += snprintf(buf + len, max(size - len, 0), "%X,", i);
+ for_each_set_bit_from(bit, bm, max_bit)
+ len += snprintf(buf + len, max(size - len, 0), "%X,", bit);
return len;
}
-static int input_print_modalias(char *buf, int size, const struct input_dev *id,
- int add_cr)
+static int input_print_modalias_parts(char *buf, int size, int full_len,
+ const struct input_dev *id)
{
- int len;
+ int len, klen, remainder, space;
len = snprintf(buf, max(size, 0),
"input:b%04Xv%04Xp%04Xe%04X-",
@@ -1399,8 +1399,48 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'e', id->evbit, 0, EV_MAX);
- len += input_print_modalias_bits(buf + len, size - len,
+
+ /*
+ * Calculate the remaining space in the buffer making sure we
+ * have place for the terminating 0.
+ */
+ space = max(size - (len + 1), 0);
+
+ klen = input_print_modalias_bits(buf + len, size - len,
'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX);
+ len += klen;
+
+ /*
+ * If we have more data than we can fit in the buffer, check
+ * if we can trim key data to fit in the rest. We will indicate
+ * that key data is incomplete by adding "+" sign at the end, like
+ * this: * "k1,2,3,45,+,".
+ *
+ * Note that we shortest key info (if present) is "k+," so we
+ * can only try to trim if key data is longer than that.
+ */
+ if (full_len && size < full_len + 1 && klen > 3) {
+ remainder = full_len - len;
+ /*
+ * We can only trim if we have space for the remainder
+ * and also for at least "k+," which is 3 more characters.
+ */
+ if (remainder <= space - 3) {
+ /*
+ * We are guaranteed to have 'k' in the buffer, so
+ * we need at least 3 additional bytes for storing
+ * "+," in addition to the remainder.
+ */
+ for (int i = size - 1 - remainder - 3; i >= 0; i--) {
+ if (buf[i] == 'k' || buf[i] == ',') {
+ strcpy(buf + i + 1, "+,");
+ len = i + 3; /* Not counting '\0' */
+ break;
+ }
+ }
+ }
+ }
+
len += input_print_modalias_bits(buf + len, size - len,
'r', id->relbit, 0, REL_MAX);
len += input_print_modalias_bits(buf + len, size - len,
@@ -1416,12 +1456,25 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'w', id->swbit, 0, SW_MAX);
- if (add_cr)
- len += snprintf(buf + len, max(size - len, 0), "\n");
-
return len;
}
+static int input_print_modalias(char *buf, int size, const struct input_dev *id)
+{
+ int full_len;
+
+ /*
+ * Printing is done in 2 passes: first one figures out total length
+ * needed for the modalias string, second one will try to trim key
+ * data in case when buffer is too small for the entire modalias.
+ * If the buffer is too small regardless, it will fill as much as it
+ * can (without trimming key data) into the buffer and leave it to
+ * the caller to figure out what to do with the result.
+ */
+ full_len = input_print_modalias_parts(NULL, 0, 0, id);
+ return input_print_modalias_parts(buf, size, full_len, id);
+}
+
static ssize_t input_dev_show_modalias(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1429,7 +1482,9 @@ static ssize_t input_dev_show_modalias(struct device *dev,
struct input_dev *id = to_input_dev(dev);
ssize_t len;
- len = input_print_modalias(buf, PAGE_SIZE, id, 1);
+ len = input_print_modalias(buf, PAGE_SIZE, id);
+ if (len < PAGE_SIZE - 2)
+ len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return min_t(int, len, PAGE_SIZE);
}
@@ -1641,6 +1696,23 @@ static int input_add_uevent_bm_var(struct kobj_uevent_env *env,
return 0;
}
+/*
+ * This is a pretty gross hack. When building uevent data the driver core
+ * may try adding more environment variables to kobj_uevent_env without
+ * telling us, so we have no idea how much of the buffer we can use to
+ * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially
+ * reduce amount of memory we will use for the modalias environment variable.
+ *
+ * The potential additions are:
+ *
+ * SEQNUM=18446744073709551615 - (%llu - 28 bytes)
+ * HOME=/ (6 bytes)
+ * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes)
+ *
+ * 68 bytes total. Allow extra buffer - 96 bytes
+ */
+#define UEVENT_ENV_EXTRA_LEN 96
+
static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
const struct input_dev *dev)
{
@@ -1650,9 +1722,11 @@ static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1],
- sizeof(env->buf) - env->buflen,
- dev, 0);
- if (len >= (sizeof(env->buf) - env->buflen))
+ (int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN,
+ dev);
+ if (len >= ((int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN))
return -ENOMEM;
env->buflen += len;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 0774d19038c496f0c3602fb505c43e1b2d8eed85
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052512-grunt-open-ce9d@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0774d19038c496f0c3602fb505c43e1b2d8eed85 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Date: Mon, 29 Apr 2024 14:50:41 -0700
Subject: [PATCH] Input: try trimming too long modalias strings
If an input device declares too many capability bits then modalias
string for such device may become too long and not fit into uevent
buffer, resulting in failure of sending said uevent. This, in turn,
may prevent userspace from recognizing existence of such devices.
This is typically not a concern for real hardware devices as they have
a limited number of keys, but it does happen with synthetic devices such as
the ones created by the xen-kbdfront driver, which creates devices as being
capable of delivering all possible keys, since it doesn't know what
keys the backend may produce.
To deal with such devices input core will attempt to trim key data,
in the hope that the rest of modalias string will fit in the given
buffer. When trimming key data it will indicate that it is not
complete by placing "+," sign, resulting in conversions like this:
old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
new: k71,72,73,74,78,7A,7B,7C,+,
This should allow existing udev rules to continue to work with existing
devices, and will also allow writing more complex rules that would
recognize trimmed modalias and check input device characteristics by
other means (for example by parsing KEY= data in uevent or parsing
input device sysfs attributes).
Note that the driver core may try adding more uevent environment
variables once input core is done adding its own, so when forming
modalias we cannot use the entire available buffer, so we reduce
it by a somewhat arbitrary amount (96 bytes).
Reported-by: Jason Andryuk <jandryuk(a)gmail.com>
Reviewed-by: Peter Hutterer <peter.hutterer(a)who-t.net>
Tested-by: Jason Andryuk <jandryuk(a)gmail.com>
Link: https://lore.kernel.org/r/ZjAWMQCJdrxZkvkB@google.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
diff --git a/drivers/input/input.c b/drivers/input/input.c
index 711485437567..fd4997ba263c 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1378,19 +1378,19 @@ static int input_print_modalias_bits(char *buf, int size,
char name, const unsigned long *bm,
unsigned int min_bit, unsigned int max_bit)
{
- int len = 0, i;
+ int bit = min_bit;
+ int len = 0;
len += snprintf(buf, max(size, 0), "%c", name);
- for (i = min_bit; i < max_bit; i++)
- if (bm[BIT_WORD(i)] & BIT_MASK(i))
- len += snprintf(buf + len, max(size - len, 0), "%X,", i);
+ for_each_set_bit_from(bit, bm, max_bit)
+ len += snprintf(buf + len, max(size - len, 0), "%X,", bit);
return len;
}
-static int input_print_modalias(char *buf, int size, const struct input_dev *id,
- int add_cr)
+static int input_print_modalias_parts(char *buf, int size, int full_len,
+ const struct input_dev *id)
{
- int len;
+ int len, klen, remainder, space;
len = snprintf(buf, max(size, 0),
"input:b%04Xv%04Xp%04Xe%04X-",
@@ -1399,8 +1399,48 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'e', id->evbit, 0, EV_MAX);
- len += input_print_modalias_bits(buf + len, size - len,
+
+ /*
+ * Calculate the remaining space in the buffer making sure we
+ * have place for the terminating 0.
+ */
+ space = max(size - (len + 1), 0);
+
+ klen = input_print_modalias_bits(buf + len, size - len,
'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX);
+ len += klen;
+
+ /*
+ * If we have more data than we can fit in the buffer, check
+ * if we can trim key data to fit in the rest. We will indicate
+ * that key data is incomplete by adding "+" sign at the end, like
+ * this: * "k1,2,3,45,+,".
+ *
+ * Note that we shortest key info (if present) is "k+," so we
+ * can only try to trim if key data is longer than that.
+ */
+ if (full_len && size < full_len + 1 && klen > 3) {
+ remainder = full_len - len;
+ /*
+ * We can only trim if we have space for the remainder
+ * and also for at least "k+," which is 3 more characters.
+ */
+ if (remainder <= space - 3) {
+ /*
+ * We are guaranteed to have 'k' in the buffer, so
+ * we need at least 3 additional bytes for storing
+ * "+," in addition to the remainder.
+ */
+ for (int i = size - 1 - remainder - 3; i >= 0; i--) {
+ if (buf[i] == 'k' || buf[i] == ',') {
+ strcpy(buf + i + 1, "+,");
+ len = i + 3; /* Not counting '\0' */
+ break;
+ }
+ }
+ }
+ }
+
len += input_print_modalias_bits(buf + len, size - len,
'r', id->relbit, 0, REL_MAX);
len += input_print_modalias_bits(buf + len, size - len,
@@ -1416,12 +1456,25 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'w', id->swbit, 0, SW_MAX);
- if (add_cr)
- len += snprintf(buf + len, max(size - len, 0), "\n");
-
return len;
}
+static int input_print_modalias(char *buf, int size, const struct input_dev *id)
+{
+ int full_len;
+
+ /*
+ * Printing is done in 2 passes: first one figures out total length
+ * needed for the modalias string, second one will try to trim key
+ * data in case when buffer is too small for the entire modalias.
+ * If the buffer is too small regardless, it will fill as much as it
+ * can (without trimming key data) into the buffer and leave it to
+ * the caller to figure out what to do with the result.
+ */
+ full_len = input_print_modalias_parts(NULL, 0, 0, id);
+ return input_print_modalias_parts(buf, size, full_len, id);
+}
+
static ssize_t input_dev_show_modalias(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1429,7 +1482,9 @@ static ssize_t input_dev_show_modalias(struct device *dev,
struct input_dev *id = to_input_dev(dev);
ssize_t len;
- len = input_print_modalias(buf, PAGE_SIZE, id, 1);
+ len = input_print_modalias(buf, PAGE_SIZE, id);
+ if (len < PAGE_SIZE - 2)
+ len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return min_t(int, len, PAGE_SIZE);
}
@@ -1641,6 +1696,23 @@ static int input_add_uevent_bm_var(struct kobj_uevent_env *env,
return 0;
}
+/*
+ * This is a pretty gross hack. When building uevent data the driver core
+ * may try adding more environment variables to kobj_uevent_env without
+ * telling us, so we have no idea how much of the buffer we can use to
+ * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially
+ * reduce amount of memory we will use for the modalias environment variable.
+ *
+ * The potential additions are:
+ *
+ * SEQNUM=18446744073709551615 - (%llu - 28 bytes)
+ * HOME=/ (6 bytes)
+ * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes)
+ *
+ * 68 bytes total. Allow extra buffer - 96 bytes
+ */
+#define UEVENT_ENV_EXTRA_LEN 96
+
static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
const struct input_dev *dev)
{
@@ -1650,9 +1722,11 @@ static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1],
- sizeof(env->buf) - env->buflen,
- dev, 0);
- if (len >= (sizeof(env->buf) - env->buflen))
+ (int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN,
+ dev);
+ if (len >= ((int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN))
return -ENOMEM;
env->buflen += len;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 0774d19038c496f0c3602fb505c43e1b2d8eed85
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052511-greedless-jukebox-5abd@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0774d19038c496f0c3602fb505c43e1b2d8eed85 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Date: Mon, 29 Apr 2024 14:50:41 -0700
Subject: [PATCH] Input: try trimming too long modalias strings
If an input device declares too many capability bits then modalias
string for such device may become too long and not fit into uevent
buffer, resulting in failure of sending said uevent. This, in turn,
may prevent userspace from recognizing existence of such devices.
This is typically not a concern for real hardware devices as they have
limited number of keys, but it happens with synthetic devices such as
ones created by xen-kbdfront driver, which creates devices as being
capable of delivering all possible keys, since it doesn't know what
keys the backend may produce.
To deal with such devices input core will attempt to trim key data,
in the hope that the rest of modalias string will fit in the given
buffer. When trimming key data it will indicate that it is not
complete by placing "+," sign, resulting in conversions like this:
old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
new: k71,72,73,74,78,7A,7B,7C,+,
This should allow existing udev rules to continue to work with existing
devices, and will also allow writing more complex rules that would
recognize trimmed modalias and check input device characteristics by
other means (for example by parsing KEY= data in uevent or parsing
input device sysfs attributes).
Note that the driver core may try adding more uevent environment
variables once input core is done adding its own, so when forming
modalias we can not use the entire available buffer, so we reduce
it by a somewhat arbitrary amount (96 bytes).
Reported-by: Jason Andryuk <jandryuk(a)gmail.com>
Reviewed-by: Peter Hutterer <peter.hutterer(a)who-t.net>
Tested-by: Jason Andryuk <jandryuk(a)gmail.com>
Link: https://lore.kernel.org/r/ZjAWMQCJdrxZkvkB@google.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
diff --git a/drivers/input/input.c b/drivers/input/input.c
index 711485437567..fd4997ba263c 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1378,19 +1378,19 @@ static int input_print_modalias_bits(char *buf, int size,
char name, const unsigned long *bm,
unsigned int min_bit, unsigned int max_bit)
{
- int len = 0, i;
+ int bit = min_bit;
+ int len = 0;
len += snprintf(buf, max(size, 0), "%c", name);
- for (i = min_bit; i < max_bit; i++)
- if (bm[BIT_WORD(i)] & BIT_MASK(i))
- len += snprintf(buf + len, max(size - len, 0), "%X,", i);
+ for_each_set_bit_from(bit, bm, max_bit)
+ len += snprintf(buf + len, max(size - len, 0), "%X,", bit);
return len;
}
-static int input_print_modalias(char *buf, int size, const struct input_dev *id,
- int add_cr)
+static int input_print_modalias_parts(char *buf, int size, int full_len,
+ const struct input_dev *id)
{
- int len;
+ int len, klen, remainder, space;
len = snprintf(buf, max(size, 0),
"input:b%04Xv%04Xp%04Xe%04X-",
@@ -1399,8 +1399,48 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'e', id->evbit, 0, EV_MAX);
- len += input_print_modalias_bits(buf + len, size - len,
+
+ /*
+ * Calculate the remaining space in the buffer making sure we
+ * have place for the terminating 0.
+ */
+ space = max(size - (len + 1), 0);
+
+ klen = input_print_modalias_bits(buf + len, size - len,
'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX);
+ len += klen;
+
+ /*
+ * If we have more data than we can fit in the buffer, check
+ * if we can trim key data to fit in the rest. We will indicate
+ * that key data is incomplete by adding "+" sign at the end, like
+ * this: * "k1,2,3,45,+,".
+ *
+ * Note that we shortest key info (if present) is "k+," so we
+ * can only try to trim if key data is longer than that.
+ */
+ if (full_len && size < full_len + 1 && klen > 3) {
+ remainder = full_len - len;
+ /*
+ * We can only trim if we have space for the remainder
+ * and also for at least "k+," which is 3 more characters.
+ */
+ if (remainder <= space - 3) {
+ /*
+ * We are guaranteed to have 'k' in the buffer, so
+ * we need at least 3 additional bytes for storing
+ * "+," in addition to the remainder.
+ */
+ for (int i = size - 1 - remainder - 3; i >= 0; i--) {
+ if (buf[i] == 'k' || buf[i] == ',') {
+ strcpy(buf + i + 1, "+,");
+ len = i + 3; /* Not counting '\0' */
+ break;
+ }
+ }
+ }
+ }
+
len += input_print_modalias_bits(buf + len, size - len,
'r', id->relbit, 0, REL_MAX);
len += input_print_modalias_bits(buf + len, size - len,
@@ -1416,12 +1456,25 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'w', id->swbit, 0, SW_MAX);
- if (add_cr)
- len += snprintf(buf + len, max(size - len, 0), "\n");
-
return len;
}
+static int input_print_modalias(char *buf, int size, const struct input_dev *id)
+{
+ int full_len;
+
+ /*
+ * Printing is done in 2 passes: first one figures out total length
+ * needed for the modalias string, second one will try to trim key
+ * data in case when buffer is too small for the entire modalias.
+ * If the buffer is too small regardless, it will fill as much as it
+ * can (without trimming key data) into the buffer and leave it to
+ * the caller to figure out what to do with the result.
+ */
+ full_len = input_print_modalias_parts(NULL, 0, 0, id);
+ return input_print_modalias_parts(buf, size, full_len, id);
+}
+
static ssize_t input_dev_show_modalias(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1429,7 +1482,9 @@ static ssize_t input_dev_show_modalias(struct device *dev,
struct input_dev *id = to_input_dev(dev);
ssize_t len;
- len = input_print_modalias(buf, PAGE_SIZE, id, 1);
+ len = input_print_modalias(buf, PAGE_SIZE, id);
+ if (len < PAGE_SIZE - 2)
+ len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return min_t(int, len, PAGE_SIZE);
}
@@ -1641,6 +1696,23 @@ static int input_add_uevent_bm_var(struct kobj_uevent_env *env,
return 0;
}
+/*
+ * This is a pretty gross hack. When building uevent data the driver core
+ * may try adding more environment variables to kobj_uevent_env without
+ * telling us, so we have no idea how much of the buffer we can use to
+ * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially
+ * reduce amount of memory we will use for the modalias environment variable.
+ *
+ * The potential additions are:
+ *
+ * SEQNUM=18446744073709551615 - (%llu - 28 bytes)
+ * HOME=/ (6 bytes)
+ * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes)
+ *
+ * 68 bytes total. Allow extra buffer - 96 bytes
+ */
+#define UEVENT_ENV_EXTRA_LEN 96
+
static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
const struct input_dev *dev)
{
@@ -1650,9 +1722,11 @@ static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1],
- sizeof(env->buf) - env->buflen,
- dev, 0);
- if (len >= (sizeof(env->buf) - env->buflen))
+ (int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN,
+ dev);
+ if (len >= ((int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN))
return -ENOMEM;
env->buflen += len;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 0774d19038c496f0c3602fb505c43e1b2d8eed85
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052509-curvature-profane-0ac3@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0774d19038c496f0c3602fb505c43e1b2d8eed85 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Date: Mon, 29 Apr 2024 14:50:41 -0700
Subject: [PATCH] Input: try trimming too long modalias strings
If an input device declares too many capability bits then modalias
string for such device may become too long and not fit into uevent
buffer, resulting in failure of sending said uevent. This, in turn,
may prevent userspace from recognizing existence of such devices.
This is typically not a concern for real hardware devices as they have
limited number of keys, but it happens with synthetic devices such as
ones created by xen-kbdfront driver, which creates devices as being
capable of delivering all possible keys, since it doesn't know what
keys the backend may produce.
To deal with such devices input core will attempt to trim key data,
in the hope that the rest of modalias string will fit in the given
buffer. When trimming key data it will indicate that it is not
complete by placing "+," sign, resulting in conversions like this:
old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
new: k71,72,73,74,78,7A,7B,7C,+,
This should allow existing udev rules to continue to work with existing
devices, and will also allow writing more complex rules that would
recognize trimmed modalias and check input device characteristics by
other means (for example by parsing KEY= data in uevent or parsing
input device sysfs attributes).
Note that the driver core may try adding more uevent environment
variables once input core is done adding its own, so when forming
modalias we can not use the entire available buffer, so we reduce
it by a somewhat arbitrary amount (96 bytes).
Reported-by: Jason Andryuk <jandryuk(a)gmail.com>
Reviewed-by: Peter Hutterer <peter.hutterer(a)who-t.net>
Tested-by: Jason Andryuk <jandryuk(a)gmail.com>
Link: https://lore.kernel.org/r/ZjAWMQCJdrxZkvkB@google.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
diff --git a/drivers/input/input.c b/drivers/input/input.c
index 711485437567..fd4997ba263c 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1378,19 +1378,19 @@ static int input_print_modalias_bits(char *buf, int size,
char name, const unsigned long *bm,
unsigned int min_bit, unsigned int max_bit)
{
- int len = 0, i;
+ int bit = min_bit;
+ int len = 0;
len += snprintf(buf, max(size, 0), "%c", name);
- for (i = min_bit; i < max_bit; i++)
- if (bm[BIT_WORD(i)] & BIT_MASK(i))
- len += snprintf(buf + len, max(size - len, 0), "%X,", i);
+ for_each_set_bit_from(bit, bm, max_bit)
+ len += snprintf(buf + len, max(size - len, 0), "%X,", bit);
return len;
}
-static int input_print_modalias(char *buf, int size, const struct input_dev *id,
- int add_cr)
+static int input_print_modalias_parts(char *buf, int size, int full_len,
+ const struct input_dev *id)
{
- int len;
+ int len, klen, remainder, space;
len = snprintf(buf, max(size, 0),
"input:b%04Xv%04Xp%04Xe%04X-",
@@ -1399,8 +1399,48 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'e', id->evbit, 0, EV_MAX);
- len += input_print_modalias_bits(buf + len, size - len,
+
+ /*
+ * Calculate the remaining space in the buffer making sure we
+ * have place for the terminating 0.
+ */
+ space = max(size - (len + 1), 0);
+
+ klen = input_print_modalias_bits(buf + len, size - len,
'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX);
+ len += klen;
+
+ /*
+ * If we have more data than we can fit in the buffer, check
+ * if we can trim key data to fit in the rest. We will indicate
+ * that key data is incomplete by adding "+" sign at the end, like
+ * this: * "k1,2,3,45,+,".
+ *
+ * Note that we shortest key info (if present) is "k+," so we
+ * can only try to trim if key data is longer than that.
+ */
+ if (full_len && size < full_len + 1 && klen > 3) {
+ remainder = full_len - len;
+ /*
+ * We can only trim if we have space for the remainder
+ * and also for at least "k+," which is 3 more characters.
+ */
+ if (remainder <= space - 3) {
+ /*
+ * We are guaranteed to have 'k' in the buffer, so
+ * we need at least 3 additional bytes for storing
+ * "+," in addition to the remainder.
+ */
+ for (int i = size - 1 - remainder - 3; i >= 0; i--) {
+ if (buf[i] == 'k' || buf[i] == ',') {
+ strcpy(buf + i + 1, "+,");
+ len = i + 3; /* Not counting '\0' */
+ break;
+ }
+ }
+ }
+ }
+
len += input_print_modalias_bits(buf + len, size - len,
'r', id->relbit, 0, REL_MAX);
len += input_print_modalias_bits(buf + len, size - len,
@@ -1416,12 +1456,25 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'w', id->swbit, 0, SW_MAX);
- if (add_cr)
- len += snprintf(buf + len, max(size - len, 0), "\n");
-
return len;
}
+static int input_print_modalias(char *buf, int size, const struct input_dev *id)
+{
+ int full_len;
+
+ /*
+ * Printing is done in 2 passes: first one figures out total length
+ * needed for the modalias string, second one will try to trim key
+ * data in case when buffer is too small for the entire modalias.
+ * If the buffer is too small regardless, it will fill as much as it
+ * can (without trimming key data) into the buffer and leave it to
+ * the caller to figure out what to do with the result.
+ */
+ full_len = input_print_modalias_parts(NULL, 0, 0, id);
+ return input_print_modalias_parts(buf, size, full_len, id);
+}
+
static ssize_t input_dev_show_modalias(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1429,7 +1482,9 @@ static ssize_t input_dev_show_modalias(struct device *dev,
struct input_dev *id = to_input_dev(dev);
ssize_t len;
- len = input_print_modalias(buf, PAGE_SIZE, id, 1);
+ len = input_print_modalias(buf, PAGE_SIZE, id);
+ if (len < PAGE_SIZE - 2)
+ len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return min_t(int, len, PAGE_SIZE);
}
@@ -1641,6 +1696,23 @@ static int input_add_uevent_bm_var(struct kobj_uevent_env *env,
return 0;
}
+/*
+ * This is a pretty gross hack. When building uevent data the driver core
+ * may try adding more environment variables to kobj_uevent_env without
+ * telling us, so we have no idea how much of the buffer we can use to
+ * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially
+ * reduce amount of memory we will use for the modalias environment variable.
+ *
+ * The potential additions are:
+ *
+ * SEQNUM=18446744073709551615 - (%llu - 28 bytes)
+ * HOME=/ (6 bytes)
+ * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes)
+ *
+ * 68 bytes total. Allow extra buffer - 96 bytes
+ */
+#define UEVENT_ENV_EXTRA_LEN 96
+
static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
const struct input_dev *dev)
{
@@ -1650,9 +1722,11 @@ static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1],
- sizeof(env->buf) - env->buflen,
- dev, 0);
- if (len >= (sizeof(env->buf) - env->buflen))
+ (int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN,
+ dev);
+ if (len >= ((int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN))
return -ENOMEM;
env->buflen += len;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0774d19038c496f0c3602fb505c43e1b2d8eed85
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052508-stoke-gibberish-c092@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0774d19038c496f0c3602fb505c43e1b2d8eed85 Mon Sep 17 00:00:00 2001
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Date: Mon, 29 Apr 2024 14:50:41 -0700
Subject: [PATCH] Input: try trimming too long modalias strings
If an input device declares too many capability bits then modalias
string for such device may become too long and not fit into uevent
buffer, resulting in failure of sending said uevent. This, in turn,
may prevent userspace from recognizing existence of such devices.
This is typically not a concern for real hardware devices as they have
limited number of keys, but it happens with synthetic devices such as
ones created by xen-kbdfront driver, which creates devices as being
capable of delivering all possible keys, since it doesn't know what
keys the backend may produce.
To deal with such devices input core will attempt to trim key data,
in the hope that the rest of modalias string will fit in the given
buffer. When trimming key data it will indicate that it is not
complete by placing "+," sign, resulting in conversions like this:
old: k71,72,73,74,78,7A,7B,7C,7D,8E,9E,A4,AD,E0,E1,E4,F8,174,
new: k71,72,73,74,78,7A,7B,7C,+,
This should allow existing udev rules to continue to work with existing
devices, and will also allow writing more complex rules that would
recognize trimmed modalias and check input device characteristics by
other means (for example by parsing KEY= data in uevent or parsing
input device sysfs attributes).
Note that the driver core may try adding more uevent environment
variables once input core is done adding its own, so when forming
modalias we can not use the entire available buffer, so we reduce
it by a somewhat arbitrary amount (96 bytes).
Reported-by: Jason Andryuk <jandryuk(a)gmail.com>
Reviewed-by: Peter Hutterer <peter.hutterer(a)who-t.net>
Tested-by: Jason Andryuk <jandryuk(a)gmail.com>
Link: https://lore.kernel.org/r/ZjAWMQCJdrxZkvkB@google.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
diff --git a/drivers/input/input.c b/drivers/input/input.c
index 711485437567..fd4997ba263c 100644
--- a/drivers/input/input.c
+++ b/drivers/input/input.c
@@ -1378,19 +1378,19 @@ static int input_print_modalias_bits(char *buf, int size,
char name, const unsigned long *bm,
unsigned int min_bit, unsigned int max_bit)
{
- int len = 0, i;
+ int bit = min_bit;
+ int len = 0;
len += snprintf(buf, max(size, 0), "%c", name);
- for (i = min_bit; i < max_bit; i++)
- if (bm[BIT_WORD(i)] & BIT_MASK(i))
- len += snprintf(buf + len, max(size - len, 0), "%X,", i);
+ for_each_set_bit_from(bit, bm, max_bit)
+ len += snprintf(buf + len, max(size - len, 0), "%X,", bit);
return len;
}
-static int input_print_modalias(char *buf, int size, const struct input_dev *id,
- int add_cr)
+static int input_print_modalias_parts(char *buf, int size, int full_len,
+ const struct input_dev *id)
{
- int len;
+ int len, klen, remainder, space;
len = snprintf(buf, max(size, 0),
"input:b%04Xv%04Xp%04Xe%04X-",
@@ -1399,8 +1399,48 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'e', id->evbit, 0, EV_MAX);
- len += input_print_modalias_bits(buf + len, size - len,
+
+ /*
+ * Calculate the remaining space in the buffer making sure we
+ * have place for the terminating 0.
+ */
+ space = max(size - (len + 1), 0);
+
+ klen = input_print_modalias_bits(buf + len, size - len,
'k', id->keybit, KEY_MIN_INTERESTING, KEY_MAX);
+ len += klen;
+
+ /*
+ * If we have more data than we can fit in the buffer, check
+ * if we can trim key data to fit in the rest. We will indicate
+ * that key data is incomplete by adding "+" sign at the end, like
+ * this: * "k1,2,3,45,+,".
+ *
+ * Note that we shortest key info (if present) is "k+," so we
+ * can only try to trim if key data is longer than that.
+ */
+ if (full_len && size < full_len + 1 && klen > 3) {
+ remainder = full_len - len;
+ /*
+ * We can only trim if we have space for the remainder
+ * and also for at least "k+," which is 3 more characters.
+ */
+ if (remainder <= space - 3) {
+ /*
+ * We are guaranteed to have 'k' in the buffer, so
+ * we need at least 3 additional bytes for storing
+ * "+," in addition to the remainder.
+ */
+ for (int i = size - 1 - remainder - 3; i >= 0; i--) {
+ if (buf[i] == 'k' || buf[i] == ',') {
+ strcpy(buf + i + 1, "+,");
+ len = i + 3; /* Not counting '\0' */
+ break;
+ }
+ }
+ }
+ }
+
len += input_print_modalias_bits(buf + len, size - len,
'r', id->relbit, 0, REL_MAX);
len += input_print_modalias_bits(buf + len, size - len,
@@ -1416,12 +1456,25 @@ static int input_print_modalias(char *buf, int size, const struct input_dev *id,
len += input_print_modalias_bits(buf + len, size - len,
'w', id->swbit, 0, SW_MAX);
- if (add_cr)
- len += snprintf(buf + len, max(size - len, 0), "\n");
-
return len;
}
+static int input_print_modalias(char *buf, int size, const struct input_dev *id)
+{
+ int full_len;
+
+ /*
+ * Printing is done in 2 passes: first one figures out total length
+ * needed for the modalias string, second one will try to trim key
+ * data in case when buffer is too small for the entire modalias.
+ * If the buffer is too small regardless, it will fill as much as it
+ * can (without trimming key data) into the buffer and leave it to
+ * the caller to figure out what to do with the result.
+ */
+ full_len = input_print_modalias_parts(NULL, 0, 0, id);
+ return input_print_modalias_parts(buf, size, full_len, id);
+}
+
static ssize_t input_dev_show_modalias(struct device *dev,
struct device_attribute *attr,
char *buf)
@@ -1429,7 +1482,9 @@ static ssize_t input_dev_show_modalias(struct device *dev,
struct input_dev *id = to_input_dev(dev);
ssize_t len;
- len = input_print_modalias(buf, PAGE_SIZE, id, 1);
+ len = input_print_modalias(buf, PAGE_SIZE, id);
+ if (len < PAGE_SIZE - 2)
+ len += snprintf(buf + len, PAGE_SIZE - len, "\n");
return min_t(int, len, PAGE_SIZE);
}
@@ -1641,6 +1696,23 @@ static int input_add_uevent_bm_var(struct kobj_uevent_env *env,
return 0;
}
+/*
+ * This is a pretty gross hack. When building uevent data the driver core
+ * may try adding more environment variables to kobj_uevent_env without
+ * telling us, so we have no idea how much of the buffer we can use to
+ * avoid overflows/-ENOMEM elsewhere. To work around this let's artificially
+ * reduce amount of memory we will use for the modalias environment variable.
+ *
+ * The potential additions are:
+ *
+ * SEQNUM=18446744073709551615 - (%llu - 28 bytes)
+ * HOME=/ (6 bytes)
+ * PATH=/sbin:/bin:/usr/sbin:/usr/bin (34 bytes)
+ *
+ * 68 bytes total. Allow extra buffer - 96 bytes
+ */
+#define UEVENT_ENV_EXTRA_LEN 96
+
static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
const struct input_dev *dev)
{
@@ -1650,9 +1722,11 @@ static int input_add_uevent_modalias_var(struct kobj_uevent_env *env,
return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1],
- sizeof(env->buf) - env->buflen,
- dev, 0);
- if (len >= (sizeof(env->buf) - env->buflen))
+ (int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN,
+ dev);
+ if (len >= ((int)sizeof(env->buf) - env->buflen -
+ UEVENT_ENV_EXTRA_LEN))
return -ENOMEM;
env->buflen += len;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 8492bd91aa055907c67ef04f2b56f6dadd1f44bf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052509-agreement-quilt-f251@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8492bd91aa055907c67ef04f2b56f6dadd1f44bf Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Tue, 30 Apr 2024 16:04:30 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using
prescaler
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 929206a9a6e1..12915fffac27 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -554,16 +554,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ *              XTAL1 input frequency
+ *             -----------------------
+ *                    prescaler
+ * divisor = ---------------------------
+ *            baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -573,9 +585,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -591,7 +604,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
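To make the 80MHz / 50 baud example from the commit message concrete, the
return-value arithmetic can be reproduced in userspace. This is only a sketch
under stated assumptions: div_round_closest() is a stand-in for the kernel's
DIV_ROUND_CLOSEST() macro restricted to unsigned operands, and struct baud_calc
and calc_baud() are hypothetical names that do not exist in the driver. No chip
registers are touched.

```c
#include <assert.h>

/* Stand-in for the kernel's DIV_ROUND_CLOSEST() for unsigned operands. */
static unsigned long div_round_closest(unsigned long n, unsigned long d)
{
	return (n + d / 2) / d;
}

/* Hypothetical container for the values of interest (not in the driver). */
struct baud_calc {
	unsigned long div;      /* divisor that would go into DLL/DLH */
	unsigned int prescaler; /* 1, or 4 when MCR_CLKSEL would be set */
	unsigned long old_ret;  /* return value before the fix */
	unsigned long new_ret;  /* return value after the fix */
};

static struct baud_calc calc_baud(unsigned long clk, unsigned long baud)
{
	struct baud_calc r;

	assert(baud > 0); /* avoid dividing by zero in this sketch */

	r.prescaler = 1;
	r.div = clk / 16 / baud;

	/* The divisor register is 16 bits wide; fall back to the /4 prescaler. */
	if (r.div >= (1UL << 16)) {
		r.prescaler = 4;
		r.div /= r.prescaler;
	}

	/* Old code ignored the prescaler when reporting the baud rate. */
	r.old_ret = div_round_closest(clk / 16, r.div);
	/* Fixed code divides the clock by the prescaler first. */
	r.new_ret = div_round_closest((clk / r.prescaler) / 16, r.div);

	return r;
}
```

With clk = 80000000 and baud = 50, the unscaled divisor is 100000, which does
not fit in 16 bits, so the prescaler becomes 4 and the divisor 25000. The old
expression then evaluates to 200 and the fixed one to 50, matching the numbers
in the commit message.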
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 8492bd91aa055907c67ef04f2b56f6dadd1f44bf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052508-corporate-mayday-15b7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8492bd91aa055907c67ef04f2b56f6dadd1f44bf Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Tue, 30 Apr 2024 16:04:30 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using
prescaler
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 929206a9a6e1..12915fffac27 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -554,16 +554,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ *              XTAL1 input frequency
+ *             -----------------------
+ *                    prescaler
+ * divisor = ---------------------------
+ *            baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -573,9 +585,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -591,7 +604,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8492bd91aa055907c67ef04f2b56f6dadd1f44bf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052507-catty-penniless-a423@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8492bd91aa055907c67ef04f2b56f6dadd1f44bf Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Tue, 30 Apr 2024 16:04:30 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using
prescaler
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 929206a9a6e1..12915fffac27 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -554,16 +554,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ *              XTAL1 input frequency
+ *             -----------------------
+ *                    prescaler
+ * divisor = ---------------------------
+ *            baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -573,9 +585,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -591,7 +604,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 8492bd91aa055907c67ef04f2b56f6dadd1f44bf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052506-afternoon-exponent-b101@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8492bd91aa055907c67ef04f2b56f6dadd1f44bf Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Tue, 30 Apr 2024 16:04:30 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using
prescaler
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 929206a9a6e1..12915fffac27 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -554,16 +554,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ *              XTAL1 input frequency
+ *             -----------------------
+ *                    prescaler
+ * divisor = ---------------------------
+ *            baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -573,9 +585,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -591,7 +604,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 8492bd91aa055907c67ef04f2b56f6dadd1f44bf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052504-hungrily-verdict-1471@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8492bd91aa055907c67ef04f2b56f6dadd1f44bf Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Tue, 30 Apr 2024 16:04:30 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using
prescaler
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 929206a9a6e1..12915fffac27 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -554,16 +554,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ *              XTAL1 input frequency
+ *             -----------------------
+ *                    prescaler
+ * divisor = ---------------------------
+ *            baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -573,9 +585,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -591,7 +604,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 8492bd91aa055907c67ef04f2b56f6dadd1f44bf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052505-conceded-backstage-2637@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8492bd91aa055907c67ef04f2b56f6dadd1f44bf Mon Sep 17 00:00:00 2001
From: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Date: Tue, 30 Apr 2024 16:04:30 -0400
Subject: [PATCH] serial: sc16is7xx: fix bug in sc16is7xx_set_baud() when using
prescaler
When using a high speed clock with a low baud rate, the 4x prescaler is
automatically selected if required. In that case, sc16is7xx_set_baud()
properly configures the chip registers, but returns an incorrect baud
rate by not taking into account the prescaler value. This incorrect baud
rate is then fed to uart_update_timeout().
For example, with an input clock of 80MHz, and a selected baud rate of 50,
sc16is7xx_set_baud() will return 200 instead of 50.
Fix this by first changing the prescaler variable to hold the selected
prescaler value instead of the MCR bitfield. Then properly take into
account the selected prescaler value in the return value computation.
Also add better documentation about the divisor value computation.
Fixes: dfeae619d781 ("serial: sc16is7xx")
Cc: stable(a)vger.kernel.org
Signed-off-by: Hugo Villeneuve <hvilleneuve(a)dimonoff.com>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20240430200431.4102923-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c
index 929206a9a6e1..12915fffac27 100644
--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -554,16 +554,28 @@ static bool sc16is7xx_regmap_noinc(struct device *dev, unsigned int reg)
return reg == SC16IS7XX_RHR_REG;
}
+/*
+ * Configure programmable baud rate generator (divisor) according to the
+ * desired baud rate.
+ *
+ * From the datasheet, the divisor is computed according to:
+ *
+ *              XTAL1 input frequency
+ *             -----------------------
+ *                    prescaler
+ * divisor = ---------------------------
+ *            baud-rate x sampling-rate
+ */
static int sc16is7xx_set_baud(struct uart_port *port, int baud)
{
struct sc16is7xx_one *one = to_sc16is7xx_one(port, port);
u8 lcr;
- u8 prescaler = 0;
+ unsigned int prescaler = 1;
unsigned long clk = port->uartclk, div = clk / 16 / baud;
if (div >= BIT(16)) {
- prescaler = SC16IS7XX_MCR_CLKSEL_BIT;
- div /= 4;
+ prescaler = 4;
+ div /= prescaler;
}
/* Enable enhanced features */
@@ -573,9 +585,10 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
SC16IS7XX_EFR_ENABLE_BIT);
sc16is7xx_efr_unlock(port);
+ /* If bit MCR_CLKSEL is set, the divide by 4 prescaler is activated. */
sc16is7xx_port_update(port, SC16IS7XX_MCR_REG,
SC16IS7XX_MCR_CLKSEL_BIT,
- prescaler);
+ prescaler == 1 ? 0 : SC16IS7XX_MCR_CLKSEL_BIT);
/* Backup LCR and access special register set (DLL/DLH) */
lcr = sc16is7xx_port_read(port, SC16IS7XX_LCR_REG);
@@ -591,7 +604,7 @@ static int sc16is7xx_set_baud(struct uart_port *port, int baud)
/* Restore LCR and access to general register set */
sc16is7xx_port_write(port, SC16IS7XX_LCR_REG, lcr);
- return DIV_ROUND_CLOSEST(clk / 16, div);
+ return DIV_ROUND_CLOSEST((clk / prescaler) / 16, div);
}
static void sc16is7xx_handle_rx(struct uart_port *port, unsigned int rxlen,
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 47388e807f85948eefc403a8a5fdc5b406a65d5a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052539-pacific-rejoin-8bca@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47388e807f85948eefc403a8a5fdc5b406a65d5a Mon Sep 17 00:00:00 2001
From: Daniel Starke <daniel.starke(a)siemens.com>
Date: Wed, 24 Apr 2024 07:48:41 +0200
Subject: [PATCH] tty: n_gsm: fix possible out-of-bounds in gsm0_receive()
Assuming the following:
- side A configures the n_gsm in basic option mode
- side B sends the header of a basic option mode frame with data length 1
- side A switches to advanced option mode
- side B sends 2 data bytes which exceeds gsm->len
Reason: gsm->len is not used in advanced option mode.
- side A switches to basic option mode
- side B keeps sending until gsm0_receive() writes past gsm->buf
Reason: Neither gsm->state nor gsm->len have been reset after
reconfiguration.
Fix this by changing gsm->count to gsm->len comparison from equal to less
than. Also add upper limit checks against the constant MAX_MRU in
gsm0_receive() and gsm1_receive() to harden against memory corruption of
gsm->len and gsm->mru.
All other checks remain as we still need to limit the data according to the
user configuration and actual payload size.
Reported-by: j51569436(a)gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218708
Tested-by: j51569436(a)gmail.com
Fixes: e1eaea46bb40 ("tty: n_gsm line discipline")
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke(a)siemens.com>
Link: https://lore.kernel.org/r/20240424054842.7741-1-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 4036566febcb..72b82bf1c280 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -2913,7 +2913,10 @@ static void gsm0_receive(struct gsm_mux *gsm, u8 c)
break;
case GSM_DATA: /* Data */
gsm->buf[gsm->count++] = c;
- if (gsm->count == gsm->len) {
+ if (gsm->count >= MAX_MRU) {
+ gsm->bad_size++;
+ gsm->state = GSM_SEARCH;
+ } else if (gsm->count >= gsm->len) {
/* Calculate final FCS for UI frames over all data */
if ((gsm->control & ~PF) != UIH) {
gsm->fcs = gsm_fcs_add_block(gsm->fcs, gsm->buf,
@@ -3026,7 +3029,7 @@ static void gsm1_receive(struct gsm_mux *gsm, u8 c)
gsm->state = GSM_DATA;
break;
case GSM_DATA: /* Data */
- if (gsm->count > gsm->mru) { /* Allow one for the FCS */
+ if (gsm->count > gsm->mru || gsm->count > MAX_MRU) { /* Allow one for the FCS */
gsm->state = GSM_OVERRUN;
gsm->bad_size++;
} else
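The effect of the comparison change in gsm0_receive() can be illustrated with a
small userspace model. This is a sketch, not the driver code: feed_until_done()
is a hypothetical helper, the value 1500 for MAX_MRU is assumed here, and the
buffer is assumed to hold MAX_MRU + 1 bytes. Nothing is actually written; the
model only tracks the index that "buf[count++] = c" would use.

```c
#include <assert.h>

#define MAX_MRU 1500 /* assumed value; the buffer is taken to be MAX_MRU + 1 bytes */

/*
 * Model the GSM_DATA state for a receiver that enters it with the given
 * count and len. Returns the number of bytes accepted before the data
 * state is left, or -1 if a store would land outside the buffer.
 */
static int feed_until_done(unsigned int count, unsigned int len, int patched)
{
	int accepted = 0;

	assert(len > 0); /* a zero-length frame never enters the data state */

	for (;;) {
		if (count > MAX_MRU)
			return -1; /* index past the last valid slot */

		count++; /* models: buf[count++] = c */
		accepted++;

		if (patched) {
			/* Patched checks: hard cap first, then ">=" on len. */
			if (count >= MAX_MRU || count >= len)
				return accepted;
		} else if (count == len) {
			/* Original check: never true once count > len. */
			return accepted;
		}
	}
}
```

In the scenario from the commit message the receiver re-enters basic option
mode with count = 2 and len = 1. feed_until_done(2, 1, 0) keeps accepting bytes
until the index leaves the buffer and returns -1, while feed_until_done(2, 1, 1)
stops after a single byte.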
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 47388e807f85948eefc403a8a5fdc5b406a65d5a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052538-octagon-unleash-663f@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47388e807f85948eefc403a8a5fdc5b406a65d5a Mon Sep 17 00:00:00 2001
From: Daniel Starke <daniel.starke(a)siemens.com>
Date: Wed, 24 Apr 2024 07:48:41 +0200
Subject: [PATCH] tty: n_gsm: fix possible out-of-bounds in gsm0_receive()
Assuming the following:
- side A configures the n_gsm in basic option mode
- side B sends the header of a basic option mode frame with data length 1
- side A switches to advanced option mode
- side B sends 2 data bytes which exceeds gsm->len
Reason: gsm->len is not used in advanced option mode.
- side A switches to basic option mode
- side B keeps sending until gsm0_receive() writes past gsm->buf
Reason: Neither gsm->state nor gsm->len have been reset after
reconfiguration.
Fix this by changing gsm->count to gsm->len comparison from equal to less
than. Also add upper limit checks against the constant MAX_MRU in
gsm0_receive() and gsm1_receive() to harden against memory corruption of
gsm->len and gsm->mru.
All other checks remain as we still need to limit the data according to the
user configuration and actual payload size.
Reported-by: j51569436(a)gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218708
Tested-by: j51569436(a)gmail.com
Fixes: e1eaea46bb40 ("tty: n_gsm line discipline")
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke(a)siemens.com>
Link: https://lore.kernel.org/r/20240424054842.7741-1-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 4036566febcb..72b82bf1c280 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -2913,7 +2913,10 @@ static void gsm0_receive(struct gsm_mux *gsm, u8 c)
break;
case GSM_DATA: /* Data */
gsm->buf[gsm->count++] = c;
- if (gsm->count == gsm->len) {
+ if (gsm->count >= MAX_MRU) {
+ gsm->bad_size++;
+ gsm->state = GSM_SEARCH;
+ } else if (gsm->count >= gsm->len) {
/* Calculate final FCS for UI frames over all data */
if ((gsm->control & ~PF) != UIH) {
gsm->fcs = gsm_fcs_add_block(gsm->fcs, gsm->buf,
@@ -3026,7 +3029,7 @@ static void gsm1_receive(struct gsm_mux *gsm, u8 c)
gsm->state = GSM_DATA;
break;
case GSM_DATA: /* Data */
- if (gsm->count > gsm->mru) { /* Allow one for the FCS */
+ if (gsm->count > gsm->mru || gsm->count > MAX_MRU) { /* Allow one for the FCS */
gsm->state = GSM_OVERRUN;
gsm->bad_size++;
} else
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 47388e807f85948eefc403a8a5fdc5b406a65d5a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052537-payday-enjoyably-a5a8@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 47388e807f85948eefc403a8a5fdc5b406a65d5a Mon Sep 17 00:00:00 2001
From: Daniel Starke <daniel.starke(a)siemens.com>
Date: Wed, 24 Apr 2024 07:48:41 +0200
Subject: [PATCH] tty: n_gsm: fix possible out-of-bounds in gsm0_receive()
Assuming the following:
- side A configures the n_gsm in basic option mode
- side B sends the header of a basic option mode frame with data length 1
- side A switches to advanced option mode
- side B sends 2 data bytes which exceeds gsm->len
Reason: gsm->len is not used in advanced option mode.
- side A switches to basic option mode
- side B keeps sending until gsm0_receive() writes past gsm->buf
Reason: Neither gsm->state nor gsm->len have been reset after
reconfiguration.
Fix this by changing gsm->count to gsm->len comparison from equal to less
than. Also add upper limit checks against the constant MAX_MRU in
gsm0_receive() and gsm1_receive() to harden against memory corruption of
gsm->len and gsm->mru.
All other checks remain as we still need to limit the data according to the
user configuration and actual payload size.
Reported-by: j51569436(a)gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218708
Tested-by: j51569436(a)gmail.com
Fixes: e1eaea46bb40 ("tty: n_gsm line discipline")
Cc: stable(a)vger.kernel.org
Signed-off-by: Daniel Starke <daniel.starke(a)siemens.com>
Link: https://lore.kernel.org/r/20240424054842.7741-1-daniel.starke@siemens.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 4036566febcb..72b82bf1c280 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -2913,7 +2913,10 @@ static void gsm0_receive(struct gsm_mux *gsm, u8 c)
break;
case GSM_DATA: /* Data */
gsm->buf[gsm->count++] = c;
- if (gsm->count == gsm->len) {
+ if (gsm->count >= MAX_MRU) {
+ gsm->bad_size++;
+ gsm->state = GSM_SEARCH;
+ } else if (gsm->count >= gsm->len) {
/* Calculate final FCS for UI frames over all data */
if ((gsm->control & ~PF) != UIH) {
gsm->fcs = gsm_fcs_add_block(gsm->fcs, gsm->buf,
@@ -3026,7 +3029,7 @@ static void gsm1_receive(struct gsm_mux *gsm, u8 c)
gsm->state = GSM_DATA;
break;
case GSM_DATA: /* Data */
- if (gsm->count > gsm->mru) { /* Allow one for the FCS */
+ if (gsm->count > gsm->mru || gsm->count > MAX_MRU) { /* Allow one for the FCS */
gsm->state = GSM_OVERRUN;
gsm->bad_size++;
} else
I'm announcing the release of the 6.8.11 kernel.
All users of the 6.8 kernel series must upgrade.
The updated 6.8.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.8.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/ABI/stable/sysfs-block | 10 +
Documentation/admin-guide/hw-vuln/core-scheduling.rst | 4
Documentation/admin-guide/mm/damon/usage.rst | 2
Documentation/sphinx/kernel_include.py | 1
Makefile | 2
block/genhd.c | 15 +-
block/partitions/core.c | 5
drivers/android/binder.c | 2
drivers/android/binder_internal.h | 2
drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 7 -
drivers/net/ethernet/intel/ice/ice_virtchnl.c | 22 +--
drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 3
drivers/net/ethernet/micrel/ks8851_common.c | 18 --
drivers/net/usb/ax88179_178a.c | 37 +++--
drivers/remoteproc/mtk_scp.c | 10 +
drivers/tty/serial/kgdboc.c | 30 ++++
drivers/usb/dwc3/gadget.c | 4
drivers/usb/typec/tipd/core.c | 51 +++++--
drivers/usb/typec/tipd/tps6598x.h | 11 +
drivers/usb/typec/ucsi/displayport.c | 4
fs/erofs/internal.h | 7 -
fs/erofs/super.c | 124 +++++++-----------
include/linux/blkdev.h | 13 +
include/net/bluetooth/hci.h | 9 +
include/net/bluetooth/hci_core.h | 1
net/bluetooth/hci_conn.c | 71 +++++++---
net/bluetooth/hci_event.c | 31 ++--
net/bluetooth/iso.c | 2
net/bluetooth/l2cap_core.c | 38 +----
net/bluetooth/sco.c | 6
security/keys/trusted-keys/trusted_tpm2.c | 25 ++-
31 files changed, 338 insertions(+), 229 deletions(-)
Akira Yokosawa (1):
docs: kernel_include.py: Cope with docutils 0.21
AngeloGioacchino Del Regno (1):
remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
Baokun Li (1):
erofs: get rid of erofs_fs_context
Carlos Llamas (1):
binder: fix max_thread type inconsistency
Christian Brauner (1):
erofs: reliably distinguish block based and fscache mode
Christoph Hellwig (2):
block: add a disk_has_partscan helper
block: add a partscan sysfs attribute for disks
Daniel Thompson (1):
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Greg Kroah-Hartman (1):
Linux 6.8.11
Heikki Krogerus (1):
usb: typec: ucsi: displayport: Fix potential deadlock
Jacob Keller (2):
ice: pass VSI pointer into ice_vc_isvalid_q_id
ice: remove unnecessary duplicate checks for VF VSI ID
Jarkko Sakkinen (2):
KEYS: trusted: Fix memory leak in tpm2_key_encode()
KEYS: trusted: Do not use WARN when encode fails
Javier Carrasco (2):
usb: typec: tipd: fix event checking for tps25750
usb: typec: tipd: fix event checking for tps6598x
Jose Fernandez (1):
drm/amd/display: Fix division by zero in setup_dsc_config
Jose Ignacio Tornos Martinez (1):
net: usb: ax88179_178a: fix link status when link is set to down/up
Prashanth K (1):
usb: dwc3: Wait unconditionally after issuing EndXfer command
Ronald Wahl (1):
net: ks8851: Fix another TX stall caused by wrong ISR flag handling
SeongJae Park (1):
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
Sungwoo Kim (2):
Bluetooth: L2CAP: Fix slab-use-after-free in l2cap_connect()
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Thomas Weißschuh (1):
admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
I'm announcing the release of the 6.6.32 kernel.
All users of the 6.6 kernel series must upgrade.
The updated 6.6.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.6.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/ABI/stable/sysfs-block | 10
Documentation/admin-guide/hw-vuln/core-scheduling.rst | 4
Documentation/admin-guide/mm/damon/usage.rst | 2
Documentation/sphinx/kernel_include.py | 1
Makefile | 2
block/genhd.c | 15
block/partitions/core.c | 5
drivers/android/binder.c | 2
drivers/android/binder_internal.h | 2
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3
drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 7
drivers/mmc/core/mmc.c | 9
drivers/net/ethernet/intel/ice/ice_virtchnl.c | 22
drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 3
drivers/net/ethernet/micrel/ks8851_common.c | 18
drivers/net/usb/ax88179_178a.c | 37
drivers/remoteproc/mtk_scp.c | 10
drivers/tty/serial/kgdboc.c | 30
drivers/usb/dwc3/gadget.c | 4
drivers/usb/typec/tipd/core.c | 45
drivers/usb/typec/tipd/tps6598x.h | 11
drivers/usb/typec/ucsi/displayport.c | 4
fs/erofs/internal.h | 7
fs/erofs/super.c | 124 -
fs/smb/client/Makefile | 2
fs/smb/client/cached_dir.c | 24
fs/smb/client/cifs_debug.c | 38
fs/smb/client/cifsfs.c | 10
fs/smb/client/cifsglob.h | 93 -
fs/smb/client/cifsproto.h | 39
fs/smb/client/cifssmb.c | 18
fs/smb/client/connect.c | 57
fs/smb/client/dir.c | 14
fs/smb/client/file.c | 39
fs/smb/client/fs_context.c | 43
fs/smb/client/fs_context.h | 13
fs/smb/client/fscache.c | 7
fs/smb/client/inode.c | 235 +--
fs/smb/client/ioctl.c | 6
fs/smb/client/link.c | 41
fs/smb/client/misc.c | 47
fs/smb/client/ntlmssp.h | 4
fs/smb/client/readdir.c | 32
fs/smb/client/reparse.c | 532 ++++++
fs/smb/client/reparse.h | 113 +
fs/smb/client/sess.c | 73
fs/smb/client/smb1ops.c | 80 -
fs/smb/client/smb2glob.h | 27
fs/smb/client/smb2inode.c | 1396 +++++++++++-------
fs/smb/client/smb2maperror.c | 2
fs/smb/client/smb2misc.c | 10
fs/smb/client/smb2ops.c | 589 ++-----
fs/smb/client/smb2pdu.c | 336 +++-
fs/smb/client/smb2pdu.h | 46
fs/smb/client/smb2proto.h | 37
fs/smb/client/smb2status.h | 2
fs/smb/client/smb2transport.c | 2
fs/smb/client/smbdirect.c | 4
fs/smb/client/smbencrypt.c | 7
fs/smb/client/trace.h | 137 +
fs/smb/common/smb2pdu.h | 116 -
fs/smb/common/smbfsctl.h | 6
fs/smb/server/auth.c | 14
fs/smb/server/ksmbd_netlink.h | 36
fs/smb/server/mgmt/user_session.c | 28
fs/smb/server/mgmt/user_session.h | 3
fs/smb/server/misc.c | 1
fs/smb/server/oplock.c | 96 +
fs/smb/server/oplock.h | 7
fs/smb/server/smb2misc.c | 26
fs/smb/server/smb2ops.c | 6
fs/smb/server/smb2pdu.c | 338 +++-
fs/smb/server/smb2pdu.h | 31
fs/smb/server/transport_tcp.c | 2
fs/smb/server/vfs.c | 28
fs/smb/server/vfs_cache.c | 137 +
fs/smb/server/vfs_cache.h | 9
include/linux/blkdev.h | 13
include/linux/bpf_types.h | 3
include/net/bluetooth/hci.h | 9
include/net/bluetooth/hci_core.h | 1
net/bluetooth/hci_conn.c | 71
net/bluetooth/hci_event.c | 31
net/bluetooth/iso.c | 2
net/bluetooth/l2cap_core.c | 38
net/bluetooth/sco.c | 6
security/keys/trusted-keys/trusted_tpm2.c | 25
tools/testing/selftests/kselftest.h | 14
88 files changed, 3907 insertions(+), 1722 deletions(-)
Akira Yokosawa (1):
docs: kernel_include.py: Cope with docutils 0.21
Alexey Dobriyan (1):
smb: client: delete "true", "false" defines
AngeloGioacchino Del Regno (1):
remoteproc: mediatek: Make sure IPI buffer fits in L2TCM
Baokun Li (1):
erofs: get rid of erofs_fs_context
Bharath SM (2):
cifs: defer close file handles having RH lease
cifs: remove redundant variable assignment
Carlos Llamas (1):
binder: fix max_thread type inconsistency
Christian Brauner (1):
erofs: reliably distinguish block based and fscache mode
Christoph Hellwig (2):
block: add a disk_has_partscan helper
block: add a partscan sysfs attribute for disks
Colin Ian King (2):
cifs: remove redundant variable tcon_exist
ksmbd: Fix spelling mistake "connction" -> "connection"
Dan Carpenter (1):
smb: client: Fix a NULL vs IS_ERR() check in wsl_set_xattrs()
Daniel Thompson (1):
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
David Howells (2):
cifs: Pass unbyteswapped eof value into SMB2_set_eof()
cifs: Add tracing for the cifs_tcon struct refcounting
Enzo Matsumiya (3):
smb: client: negotiate compression algorithms
smb: common: fix fields sizes in compression_pattern_payload_v1
smb: common: simplify compression headers
Eric Biggers (1):
smb: use crypto_shash_digest() in symlink_hash()
Greg Kroah-Hartman (1):
Linux 6.6.32
Gustavo A. R. Silva (1):
smb: smb2pdu.h: Avoid -Wflex-array-member-not-at-end warnings
Heikki Krogerus (1):
usb: typec: ucsi: displayport: Fix potential deadlock
Jacob Keller (2):
ice: pass VSI pointer into ice_vc_isvalid_q_id
ice: remove unnecessary duplicate checks for VF VSI ID
Jarkko Sakkinen (2):
KEYS: trusted: Fix memory leak in tpm2_key_encode()
KEYS: trusted: Do not use WARN when encode fails
Javier Carrasco (1):
usb: typec: tipd: fix event checking for tps6598x
Jiri Olsa (1):
bpf: Add missing BPF_LINK_TYPE invocations
Jose Fernandez (1):
drm/amd/display: Fix division by zero in setup_dsc_config
Jose Ignacio Tornos Martinez (1):
net: usb: ax88179_178a: fix link status when link is set to down/up
Marios Makassikis (1):
ksmbd: fix possible null-deref in smb_lazy_parent_lease_break_close
Mark Brown (1):
kselftest: Add a ksft_perror() helper
Markus Elfring (1):
smb3: Improve exception handling in allocate_mr_list()
Meetakshi Setiya (4):
cifs: Add client version details to NTLM authenticate message
smb: client: reuse file lease key in compound operations
smb: client: retry compound request without reusing lease
cifs: fixes for get_inode_info
Mengqi Zhang (1):
mmc: core: Add HS400 tuning in HS400es initialization
Namjae Jeon (5):
ksmbd: mark SMB2_SESSION_EXPIRED to session when destroying previous session
ksmbd: add support for durable handles v1/v2
ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16()
ksmbd: fix potencial out-of-bounds when buffer offset is invalid
ksmbd: add continuous availability share parameter
Paulo Alcantara (17):
smb: client: allow creating symlinks via reparse points
smb: client: cleanup smb2_query_reparse_point()
smb: client: handle special files and symlinks in SMB3 POSIX
cifs: get rid of dup length check in parse_reparse_point()
smb: client: don't clobber ->i_rdev from cached reparse points
smb: client: handle path separator of created SMB symlinks
smb: client: get rid of smb311_posix_query_path_info()
smb: client: introduce reparse mount option
smb: client: move most of reparse point handling code to common file
smb: client: fix potential broken compound request
smb: client: reduce number of parameters in smb2_compound_op()
smb: client: add support for WSL reparse points
smb: client: parse uid, gid, mode and dev from WSL reparse points
smb: client: set correct d_type for reparse DFS/DFSR and mount point
smb: client: return reparse type in /proc/mounts
smb: client: fix NULL ptr deref in cifs_mark_open_handles_for_deleted_file()
smb: client: instantiate when creating SFU files
Pierre Mariani (1):
smb: client: Fix minor whitespace errors and warnings
Prashanth K (1):
usb: dwc3: Wait unconditionally after issuing EndXfer command
Randy Dunlap (2):
ksmbd: auth: fix most kernel-doc warnings
ksmbd: vfs: fix all kernel-doc warnings
Ritvik Budhiraja (1):
cifs: fix use after free for iface while disabling secondary channels
Ronald Wahl (1):
net: ks8851: Fix another TX stall caused by wrong ISR flag handling
SeongJae Park (1):
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
Shyam Prasad N (6):
cifs: print server capabilities in DebugData
cifs: pick channel for tcon and tdis
cifs: new nt status codes from MS-SMB2
cifs: new mount option called retrans
cifs: commands that are retried should have replay flag set
cifs: set replay flag for retries of write command
Srinivasan Shanmugam (1):
drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()
Steve French (23):
SMB3: clarify some of the unused CreateOption flags
Add definition for new smb3.1.1 command type
smb3: minor RDMA cleanup
smb3: more minor cleanups for session handling routines
smb3: minor cleanup of session handling code
Missing field not being returned in ioctl CIFS_IOC_GET_MNT_INFO
smb: client: introduce cifs_sfu_make_node()
smb: client: extend smb2_compound_op() to accept more commands
smb: client: allow creating special files via reparse points
smb: client: optimise reparse point querying
cifs: fix in logging in cifs_chan_update_iface
cifs: remove unneeded return statement
cifs: minor comment cleanup
cifs: update the same create_guid on replay
smb3: update allocation size more accurately on write completion
smb: client: parse owner/group when creating reparse points
smb: client: do not defer close open handles to deleted files
smb: client: introduce SMB2_OP_QUERY_WSL_EA
smb3: add dynamic trace point for ioctls
cifs: Move some extern decls from .c files to .h
smb311: correct incorrect offset field in compression header
smb311: additional compression flag defined in updated protocol spec
smb3: add trace event for mknod
Sungwoo Kim (2):
Bluetooth: L2CAP: Fix slab-use-after-free in l2cap_connect()
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Thomas Weißschuh (1):
admin-guide/hw-vuln/core-scheduling: fix return type of PR_SCHED_CORE_GET
Yang Li (2):
smb: Fix some kernel-doc comments
ksmbd: Add kernel-doc for ksmbd_extract_sharename() function
I'm announcing the release of the 4.19.315 kernel.
All users of the 4.19 kernel series must upgrade.
The updated 4.19.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.19.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Documentation/sphinx/kernel_include.py | 1
Makefile | 2
drivers/md/dm-core.h | 2
drivers/md/dm-ioctl.c | 3
drivers/md/dm-table.c | 9
drivers/tty/serial/kgdboc.c | 30
fs/btrfs/volumes.c | 1
include/linux/string.h | 20
include/linux/trace_events.h | 2
kernel/trace/Kconfig | 4
kernel/trace/Makefile | 1
kernel/trace/trace.c | 26
kernel/trace/trace_dynevent.c | 210 ++++++
kernel/trace/trace_dynevent.h | 119 +++
kernel/trace/trace_events.c | 32
kernel/trace/trace_events_hist.c | 1048 ++++++++++++++++++-------------
kernel/trace/trace_probe.c | 2
kernel/trace/trace_stack.c | 2
tools/testing/selftests/vm/map_hugetlb.c | 7
19 files changed, 1050 insertions(+), 471 deletions(-)
Akira Yokosawa (1):
docs: kernel_include.py: Cope with docutils 0.21
Daniel Thompson (1):
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Dominique Martinet (1):
btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
Greg Kroah-Hartman (1):
Linux 4.19.315
Harshit Mogalapalli (1):
Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
Masami Hiramatsu (4):
tracing: Simplify creation and deletion of synthetic events
tracing: Add unified dynamic event framework
tracing: Use dyn_event framework for synthetic events
tracing: Remove unneeded synth_event_mutex
Mikulas Patocka (1):
dm: limit the number of targets and parameter size area
Steven Rostedt (VMware) (5):
tracing: Consolidate trace_add/remove_event_call back to the nolock functions
string.h: Add str_has_prefix() helper function
tracing: Use str_has_prefix() helper for histogram code
tracing: Use str_has_prefix() instead of using fixed sizes
tracing: Have the historgram use the result of str_has_prefix() for len of prefix
Tom Zanussi (4):
tracing: Refactor hist trigger action code
tracing: Split up onmatch action data
tracing: Generalize hist trigger onmax and save action
tracing: Remove unnecessary var_ref destroy in track_data_destroy()
Hey,
I got encouraged to send another email here from
https://github.com/tpwrules/nixos-apple-silicon/issues/200.
"arm64/fpsimd: Avoid erroneous elide of user state reload" /
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id…
fixes a data corruption issue with dm-crypt on aarch64, reproducible on
the mainline Linux kernel (not just asahi specific!).
This list has been included as Cc on this commit, but it'd be very nice
to make sure this already lands in 6.9.2, due to its data corruption
nature.
Thanks,
Florian
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: b84a8aba806261d2f759ccedf4a2a6a80a5e55ba
Gitweb: https://git.kernel.org/tip/b84a8aba806261d2f759ccedf4a2a6a80a5e55ba
Author: dicken.ding <dicken.ding(a)mediatek.com>
AuthorDate: Fri, 24 May 2024 17:17:39 +08:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 24 May 2024 12:49:35 +02:00
genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
irq_find_at_or_after() dereferences the interrupt descriptor which is
returned by mt_find() while neither holding sparse_irq_lock nor RCU read
lock, which means the descriptor can be freed between mt_find() and the
dereference:
CPU0 CPU1
desc = mt_find()
delayed_free_desc(desc)
irq_desc_get_irq(desc)
The use-after-free is reported by KASAN:
Call trace:
irq_get_next_irq+0x58/0x84
show_stat+0x638/0x824
seq_read_iter+0x158/0x4ec
proc_reg_read_iter+0x94/0x12c
vfs_read+0x1e0/0x2c8
Freed by task 4471:
slab_free_freelist_hook+0x174/0x1e0
__kmem_cache_free+0xa4/0x1dc
kfree+0x64/0x128
irq_kobj_release+0x28/0x3c
kobject_put+0xcc/0x1e0
delayed_free_desc+0x14/0x2c
rcu_do_batch+0x214/0x720
Guard the access with a RCU read lock section.
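
As a rough userspace illustration of the scope-based guard idea (not the
kernel implementation): the kernel's guard() helpers are built on the
compiler's cleanup attribute, and the sketch below uses that attribute
directly. A plain counter stands in for the RCU read-side section, and
scoped_read_section(), find_at_or_after(), table and NR_DESCS are
hypothetical names made up for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a read-side section: > 0 while "inside". */
static int read_depth;
static int depth_at_deref;	/* records read_depth at the dereference */

static void read_section_enter(void) { read_depth++; }
static void read_section_exit(int *unused) { (void)unused; read_depth--; }

/*
 * Enter the section now; the cleanup attribute (a GCC/Clang extension)
 * calls read_section_exit() automatically when the enclosing scope ends,
 * including on every early return.
 */
#define scoped_read_section()						\
	int scope_guard __attribute__((cleanup(read_section_exit), unused)) = 0; \
	read_section_enter()

struct desc { unsigned int irq; };

#define NR_DESCS 4u
static struct desc *table[NR_DESCS];

/* Return the irq of the first populated slot at or after offset. */
static unsigned int find_at_or_after(unsigned int offset)
{
	scoped_read_section();

	for (unsigned int i = offset; i < NR_DESCS; i++) {
		if (table[i]) {
			depth_at_deref = read_depth;	/* still inside here */
			return table[i]->irq;
		}
	}
	return NR_DESCS;	/* nothing found; the guard still unwinds */
}
```

The point of the pattern is that the lookup and the dereference sit in the
same protected scope, and no return path can forget to leave it.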
Fixes: 721255b9826b ("genirq: Use a maple tree for interrupt descriptor management")
Signed-off-by: dicken.ding <dicken.ding(a)mediatek.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240524091739.31611-1-dicken.ding@mediatek.com
---
kernel/irq/irqdesc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 88ac365..07e99c9 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -160,7 +160,10 @@ static int irq_find_free_area(unsigned int from, unsigned int cnt)
static unsigned int irq_find_at_or_after(unsigned int offset)
{
unsigned long index = offset;
- struct irq_desc *desc = mt_find(&sparse_irqs, &index, nr_irqs);
+ struct irq_desc *desc;
+
+ guard(rcu)();
+ desc = mt_find(&sparse_irqs, &index, nr_irqs);
return desc ? irq_desc_get_irq(desc) : nr_irqs;
}
It appears that we don't allow a vcpu to be restored in AArch32
System mode, as we *never* included it in the list of valid modes.
Just add it to the list of allowed modes.
Fixes: 0d854a60b1d7 ("arm64: KVM: enable initialization of a 32bit vcpu")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
arch/arm64/kvm/guest.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index d9617b11f7a8..11098eb7eb44 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -251,6 +251,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
case PSR_AA32_MODE_SVC:
case PSR_AA32_MODE_ABT:
case PSR_AA32_MODE_UND:
+ case PSR_AA32_MODE_SYS:
if (!vcpu_el1_is_32bit(vcpu))
return -EINVAL;
break;
--
2.39.2
When userspace writes to one of the core registers, we make
sure to narrow the corresponding GPRs if PSTATE indicates
an AArch32 context.
The code tries to check whether the context is EL0 or EL1 so
that it narrows the correct registers. But it does so by checking
the full PSTATE instead of PSTATE.M.
As a consequence, and if we are restoring an AArch32 EL0 context
in a 64bit guest, and that PSTATE has *any* bit set outside of
PSTATE.M, we narrow *all* registers instead of only the first 15,
destroying the 64bit state.
Obviously, this is not something the guest is likely to enjoy.
Correctly masking PSTATE to only evaluate PSTATE.M fixes it.
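
A minimal sketch of the difference, outside the kernel. The constants are
local stand-ins whose values are my assumption of the arm64 encodings
(mode field in the low five bits, User mode as 0x10, the Thumb bit as one
example of a bit outside PSTATE.M); the register counts 15 and 31 and the
function names are likewise illustrative only.

```c
#include <stdint.h>

/* Assumed stand-ins for the arm64 PSTATE encodings. */
#define AA32_MODE_MASK	0x1fu	/* PSTATE.M field */
#define AA32_MODE_USR	0x10u	/* AArch32 EL0 (User) */
#define AA32_MODE_SVC	0x13u	/* AArch32 Supervisor, a privileged mode */
#define AA32_T_BIT	0x20u	/* Thumb state: a bit outside PSTATE.M */

#define NR_USER_REGS	15	/* only the first 15 registers are narrowed */
#define NR_ALL_REGS	31	/* assumed count for "all registers" */

/* Pre-fix behaviour: the switch sees every PSTATE bit. */
static int nr_regs_unmasked(uint64_t pstate)
{
	switch (pstate) {
	case AA32_MODE_USR:
		return NR_USER_REGS;
	default:
		return NR_ALL_REGS;
	}
}

/* Post-fix behaviour: only the mode field is evaluated. */
static int nr_regs_masked(uint64_t pstate)
{
	switch (pstate & AA32_MODE_MASK) {
	case AA32_MODE_USR:
		return NR_USER_REGS;
	default:
		return NR_ALL_REGS;
	}
}
```

With a user-mode PSTATE that also has the Thumb bit set, the unmasked
variant falls through to the "all registers" case, while the masked variant
still selects the user-mode case.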
Fixes: 90c1f934ed71 ("KVM: arm64: Get rid of the AArch32 register mapping code")
Reported-by: Nina Schoetterl-Glausch <nsg(a)linux.ibm.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
arch/arm64/kvm/guest.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index e2f762d959bb..d9617b11f7a8 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -276,7 +276,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if (*vcpu_cpsr(vcpu) & PSR_MODE32_BIT) {
int i, nr_reg;
- switch (*vcpu_cpsr(vcpu)) {
+ switch (*vcpu_cpsr(vcpu) & PSR_AA32_MODE_MASK) {
/*
* Either we are dealing with user mode, and only the
* first 15 registers (+ PC) must be narrowed to 32bit.
--
2.39.2
The quilt patch titled
Subject: mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
has been removed from the -mm tree. Its filename was
mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
Date: Thu, 23 May 2024 15:12:17 +0800
When I ran memory failure tests recently, the following panic occurred:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:1009!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:__del_page_from_free_list+0x151/0x180
RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
Call Trace:
<TASK>
__rmqueue_pcplist+0x23b/0x520
get_page_from_freelist+0x26b/0xe40
__alloc_pages_noprof+0x113/0x1120
__folio_alloc_noprof+0x11/0xb0
alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
__alloc_fresh_hugetlb_folio+0xe7/0x140
alloc_pool_huge_folio+0x68/0x100
set_max_huge_pages+0x13d/0x340
hugetlb_sysctl_handler_common+0xe8/0x110
proc_sys_call_handler+0x194/0x280
vfs_write+0x387/0x550
ksys_write+0x64/0xe0
do_syscall_64+0xc2/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff916114887
RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
</TASK>
Modules linked in: mce_inject hwpoison_inject
---[ end trace 0000000000000000 ]---
Before the panic, there was a warning about a bad page state:
BUG: Bad page state in process page-types pfn:8cee00
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
page_type: 0xffffff7f(buddy)
raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
Call Trace:
<TASK>
dump_stack_lvl+0x83/0xa0
bad_page+0x63/0xf0
free_unref_page+0x36e/0x5c0
unpoison_memory+0x50b/0x630
simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
debugfs_attr_write+0x42/0x60
full_proxy_write+0x5b/0x80
vfs_write+0xcd/0x550
ksys_write+0x64/0xe0
do_syscall_64+0xc2/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f189a514887
RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
</TASK>
The root cause appears to be the following race:
memory_failure
try_memory_failure_hugetlb
me_huge_page
__page_handle_poison
dissolve_free_hugetlb_folio
drain_all_pages -- Buddy page can be isolated e.g. for compaction.
take_page_off_buddy -- Failed as page is not in the buddy list.
-- Page can be putback into buddy after compaction.
page_ref_inc -- Leads to buddy page with refcnt = 1.
Then unpoison_memory() can unpoison the page and put the buddy page back
into the buddy list again, leading to the above bad page state warning.
bad_page() will then call page_mapcount_reset(), which removes PageBuddy
from the buddy page and leads to the later VM_BUG_ON_PAGE(!PageBuddy(page))
when trying to allocate this page.
Fix this issue by only treating __page_handle_poison() as successful when
it returns 1.
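
To make the changed comparison concrete, here is a small standalone
sketch. It assumes a three-way return convention (negative: dissolve
failed, 0: dissolved but not taken off the buddy list, 1: dissolved and
taken off); the enum names, struct fake_page and both handler functions are
hypothetical and only model the check, not the kernel code.

```c
/* Assumed return convention of the poison-handling helper. */
enum poison_result {
	POISON_DISSOLVE_FAILED = -1,
	POISON_NOT_OFF_BUDDY = 0,	/* page may still be owned by buddy */
	POISON_TAKEN_OFF = 1,		/* page is safely isolated */
};

enum mf_result { MF_FAILED, MF_RECOVERED };

struct fake_page { int refcount; };

/* Old check: ">= 0" also treats the "not taken off buddy" case as success. */
static enum mf_result handle_old(struct fake_page *p, int rc)
{
	if (rc >= 0) {
		p->refcount++;	/* pins a page that buddy may still own */
		return MF_RECOVERED;
	}
	return MF_FAILED;
}

/* New check: only a return value of 1 counts as success. */
static enum mf_result handle_new(struct fake_page *p, int rc)
{
	if (rc > 0) {
		p->refcount++;
		return MF_RECOVERED;
	}
	return MF_FAILED;
}
```

For a return value of 0, handle_old() bumps the refcount and reports
recovery, which corresponds to the "buddy page with refcnt = 1" state in
the race above; handle_new() leaves the page alone and reports failure.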
Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages
+++ a/mm/memory-failure.c
@@ -1221,7 +1221,7 @@ static int me_huge_page(struct page_stat
* subpages.
*/
folio_put(folio);
- if (__page_handle_poison(p) >= 0) {
+ if (__page_handle_poison(p) > 0) {
page_ref_inc(p);
res = MF_RECOVERED;
} else {
@@ -2091,7 +2091,7 @@ retry:
*/
if (res == 0) {
folio_unlock(folio);
- if (__page_handle_poison(p) >= 0) {
+ if (__page_handle_poison(p) > 0) {
page_ref_inc(p);
res = MF_RECOVERED;
} else {
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
The quilt patch titled
Subject: mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
has been removed from the -mm tree. Its filename was
mm-proc-pid-smaps_rollup-avoid-skipping-vma-after-getting-mmap_lock-again.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Yuanyuan Zhong <yzhong(a)purestorage.com>
Subject: mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
Date: Thu, 23 May 2024 12:35:31 -0600
After switching smaps_rollup to use VMA iterator, searching for next entry
is part of the condition expression of the do-while loop. So the current
VMA needs to be addressed before the continue statement.
Otherwise, with some VMAs skipped, userspace observed memory
consumption from /proc/pid/smaps_rollup will be smaller than the sum of
the corresponding fields from /proc/pid/smaps.
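
The control-flow issue can be reproduced in a few lines of plain C. This
is a simplified model, not the kernel code: struct range, struct iter,
iter_next() and rollup() are hypothetical, and the "relock" is simulated by
re-fetching from the iterator at a chosen iteration. The key point is that
`continue` inside a do-while jumps to the loop condition, which here
advances the iterator again.

```c
#include <stddef.h>

struct range { unsigned long start, end; };

struct iter { const struct range *r; size_t n, pos; };

/* Return the next range and advance, or NULL at the end. */
static const struct range *iter_next(struct iter *it)
{
	return it->pos < it->n ? &it->r[it->pos++] : NULL;
}

/*
 * Sum the sizes of all ranges. At iteration relock_at, simulate dropping
 * and retaking a lock, which forces a re-fetch from the iterator.
 * fixed == 0 models the pre-fix flow, fixed != 0 the post-fix flow.
 */
static unsigned long rollup(const struct range *r, size_t n,
			    size_t relock_at, int fixed)
{
	struct iter it = { r, n, 0 };
	const struct range *cur = iter_next(&it);
	unsigned long total = 0, last_end = 0;
	size_t seen = 0;

	if (!cur)
		return 0;
	do {
		total += cur->end - cur->start;
		last_end = cur->end;

		if (seen++ == relock_at) {
			cur = iter_next(&it);	/* re-fetch after "relock" */
			if (!cur)
				break;
			if (cur->start >= last_end) {
				if (fixed) {
					/* account for cur before moving on */
					total += cur->end - cur->start;
					last_end = cur->end;
				}
				/* jumps to the condition, which advances again */
				continue;
			}
		}
	} while ((cur = iter_next(&it)) != NULL);

	return total;
}
```

With three back-to-back ranges of sizes 10, 20 and 30 and a re-fetch after
the first one, the pre-fix flow totals 40 (the second range is skipped),
while the fixed flow totals 60.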
Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com
Fixes: c4c84f06285e ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
Signed-off-by: Yuanyuan Zhong <yzhong(a)purestorage.com>
Reviewed-by: Mohamed Khalfella <mkhalfella(a)purestorage.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/fs/proc/task_mmu.c~mm-proc-pid-smaps_rollup-avoid-skipping-vma-after-getting-mmap_lock-again
+++ a/fs/proc/task_mmu.c
@@ -970,12 +970,17 @@ static int show_smaps_rollup(struct seq_
break;
/* Case 1 and 2 above */
- if (vma->vm_start >= last_vma_end)
+ if (vma->vm_start >= last_vma_end) {
+ smap_gather_stats(vma, &mss, 0);
+ last_vma_end = vma->vm_end;
continue;
+ }
/* Case 4 above */
- if (vma->vm_end > last_vma_end)
+ if (vma->vm_end > last_vma_end) {
smap_gather_stats(vma, &mss, last_vma_end);
+ last_vma_end = vma->vm_end;
+ }
}
} for_each_vma(vmi, vma);
_
Patches currently in -mm which might be from yzhong(a)purestorage.com are
The quilt patch titled
Subject: nilfs2: fix potential hang in nilfs_detach_log_writer()
has been removed from the -mm tree. Its filename was
nilfs2-fix-potential-hang-in-nilfs_detach_log_writer.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix potential hang in nilfs_detach_log_writer()
Date: Mon, 20 May 2024 22:26:21 +0900
Syzbot has reported a potential hang in nilfs_detach_log_writer() called
during nilfs2 unmount.
Analysis revealed that this is because nilfs_segctor_sync(), which
synchronizes with the log writer thread, can be called after
nilfs_segctor_destroy() terminates that thread, as shown in the call trace
below:
nilfs_detach_log_writer
nilfs_segctor_destroy
nilfs_segctor_kill_thread --> Shut down log writer thread
flush_work
nilfs_iput_work_func
nilfs_dispose_list
iput
nilfs_evict_inode
nilfs_transaction_commit
nilfs_construct_segment (if inode needs sync)
nilfs_segctor_sync --> Attempt to synchronize with
log writer thread
*** DEADLOCK ***
Fix this issue by changing nilfs_segctor_sync() so that it returns
normally without synchronizing once the log writer thread has terminated,
and by forcing tasks that are already waiting to complete once the thread
terminates.
The skipped inode metadata flushout will then be processed together in the
subsequent cleanup work in nilfs_segctor_destroy().
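
The effect of the forced wake-up can be sketched with a simplified,
single-threaded model. This is not the nilfs2 code: struct wait_request,
cnt32_ge() and wakeup() are hypothetical stand-ins, an array replaces the
wait queue, and locking is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct wait_request {
	uint32_t seq;	/* sequence number this waiter is waiting for */
	int err;
	bool done;
};

/*
 * Wraparound-tolerant "a >= b" for 32-bit sequence counters. Relies on the
 * usual two's-complement conversion done by GCC and Clang.
 */
static bool cnt32_ge(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * Complete every pending request that has been satisfied, or every pending
 * request unconditionally when force is set. Returns how many were
 * completed by this call.
 */
static size_t wakeup(struct wait_request *reqs, size_t n,
		     uint32_t seq_done, int err, bool force)
{
	size_t completed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!reqs[i].done &&
		    (force || cnt32_ge(seq_done, reqs[i].seq))) {
			reqs[i].err = err;
			reqs[i].done = true;
			completed++;
		}
	}
	return completed;
}
```

Without force, a waiter whose sequence number is ahead of seq_done stays
pending; with force set, it is completed as well, which is what lets
waiters leave once no writer thread remains to satisfy them.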
Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+e3973c409251e136fdd0(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/segment.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
--- a/fs/nilfs2/segment.c~nilfs2-fix-potential-hang-in-nilfs_detach_log_writer
+++ a/fs/nilfs2/segment.c
@@ -2190,6 +2190,14 @@ static int nilfs_segctor_sync(struct nil
for (;;) {
set_current_state(TASK_INTERRUPTIBLE);
+ /*
+ * Synchronize only while the log writer thread is alive.
+ * Leave flushing out after the log writer thread exits to
+ * the cleanup work in nilfs_segctor_destroy().
+ */
+ if (!sci->sc_task)
+ break;
+
if (atomic_read(&wait_req.done)) {
err = wait_req.err;
break;
@@ -2205,7 +2213,7 @@ static int nilfs_segctor_sync(struct nil
return err;
}
-static void nilfs_segctor_wakeup(struct nilfs_sc_info *sci, int err)
+static void nilfs_segctor_wakeup(struct nilfs_sc_info *sci, int err, bool force)
{
struct nilfs_segctor_wait_request *wrq, *n;
unsigned long flags;
@@ -2213,7 +2221,7 @@ static void nilfs_segctor_wakeup(struct
spin_lock_irqsave(&sci->sc_wait_request.lock, flags);
list_for_each_entry_safe(wrq, n, &sci->sc_wait_request.head, wq.entry) {
if (!atomic_read(&wrq->done) &&
- nilfs_cnt32_ge(sci->sc_seq_done, wrq->seq)) {
+ (force || nilfs_cnt32_ge(sci->sc_seq_done, wrq->seq))) {
wrq->err = err;
atomic_set(&wrq->done, 1);
}
@@ -2362,7 +2370,7 @@ static void nilfs_segctor_notify(struct
if (mode == SC_LSEG_SR) {
sci->sc_state &= ~NILFS_SEGCTOR_COMMIT;
sci->sc_seq_done = sci->sc_seq_accepted;
- nilfs_segctor_wakeup(sci, err);
+ nilfs_segctor_wakeup(sci, err, false);
sci->sc_flush_request = 0;
} else {
if (mode == SC_FLUSH_FILE)
@@ -2746,6 +2754,13 @@ static void nilfs_segctor_destroy(struct
|| sci->sc_seq_request != sci->sc_seq_done);
spin_unlock(&sci->sc_state_lock);
+ /*
+ * Forcibly wake up tasks waiting in nilfs_segctor_sync(), which can
+ * be called from delayed iput() via nilfs_evict_inode() and can race
+ * with the above log writer thread termination.
+ */
+ nilfs_segctor_wakeup(sci, 0, true);
+
if (flush_work(&sci->sc_iput_work))
flag = true;
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
The quilt patch titled
Subject: nilfs2: fix unexpected freezing of nilfs_segctor_sync()
has been removed from the -mm tree. Its filename was
nilfs2-fix-unexpected-freezing-of-nilfs_segctor_sync.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix unexpected freezing of nilfs_segctor_sync()
Date: Mon, 20 May 2024 22:26:20 +0900
A potential and reproducible race issue has been identified where
nilfs_segctor_sync() would block even after the log writer thread writes a
checkpoint, unless there is an interrupt or other trigger to resume log
writing.
This turned out to be because, depending on the execution timing of the
log writer thread running in parallel, the log writer thread may skip
responding to nilfs_segctor_sync(), which causes a call to schedule()
waiting for completion within nilfs_segctor_sync() to lose the opportunity
to wake up.
The wake-up of the task waiting in nilfs_segctor_sync() may be skipped
because two operations are not done atomically: updating the request
generation issued using a shared sequence counter, and adding a wait queue
entry to the log writer's request wait queue. There is a possibility that
log writing and request completion notification by nilfs_segctor_wakeup()
may occur between the two operations, and in that case, the wait queue
entry is not yet visible to nilfs_segctor_wakeup() and the wake-up of
nilfs_segctor_sync() will be carried over until the next request occurs.
Fix this issue by performing these two operations simultaneously within
the lock section of sc_state_lock. Also, following the memory barrier
guidelines for event waiting loops, move the call to set_current_state()
in the same location into the event waiting loop to ensure that a memory
barrier is inserted just before the event condition determination.
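
The ordering problem described above can be illustrated with a small single-threaded model. This is only a sketch: the struct and function names below are hypothetical and are not the kernel's data structures, the "log writer" is simulated by a direct function call placed where the race window would be, and the sequence comparison merely mimics what nilfs_cnt32_ge() is understood to do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; not the kernel's structures. */
struct model_waiter {
	uint32_t seq;	/* request generation this waiter waits for */
	bool done;	/* set by the wakeup path */
};

struct model_sci {
	uint32_t seq_request;		/* last issued request generation */
	uint32_t seq_done;		/* last completed request generation */
	struct model_waiter *queued;	/* at most one queued waiter in this model */
};

/* Wakeup path: completes only waiters that are already queued. */
static void model_wakeup(struct model_sci *sci)
{
	struct model_waiter *w = sci->queued;

	if (w && !w->done && (int32_t)(sci->seq_done - w->seq) >= 0)
		w->done = true;
}

/* Simulated log writer: finish all issued requests, then notify. */
static void model_log_write(struct model_sci *sci)
{
	sci->seq_done = sci->seq_request;
	model_wakeup(sci);
}

/* Old ordering: the log writer can run between the two steps. */
bool old_order_misses_wakeup(void)
{
	struct model_sci sci = {0};
	struct model_waiter w = {0};

	w.seq = ++sci.seq_request;	/* step 1: issue the request */
	model_log_write(&sci);		/* log writer runs in the window */
	sci.queued = &w;		/* step 2: queue the waiter, too late */
	return !w.done;
}

/* New ordering: both steps happen before the log writer can observe them. */
bool new_order_misses_wakeup(void)
{
	struct model_sci sci = {0};
	struct model_waiter w = {0};

	w.seq = ++sci.seq_request;	/* steps 1 and 2 in one lock section */
	sci.queued = &w;
	model_log_write(&sci);
	return !w.done;
}
```

In this model the old ordering leaves the waiter without a completion until some later request triggers another wakeup, while the new ordering always sees the queued entry.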
Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/segment.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
--- a/fs/nilfs2/segment.c~nilfs2-fix-unexpected-freezing-of-nilfs_segctor_sync
+++ a/fs/nilfs2/segment.c
@@ -2168,19 +2168,28 @@ static int nilfs_segctor_sync(struct nil
struct nilfs_segctor_wait_request wait_req;
int err = 0;
- spin_lock(&sci->sc_state_lock);
init_wait(&wait_req.wq);
wait_req.err = 0;
atomic_set(&wait_req.done, 0);
+ init_waitqueue_entry(&wait_req.wq, current);
+
+ /*
+ * To prevent a race issue where completion notifications from the
+ * log writer thread are missed, increment the request sequence count
+ * "sc_seq_request" and insert a wait queue entry using the current
+ * sequence number into the "sc_wait_request" queue at the same time
+ * within the lock section of "sc_state_lock".
+ */
+ spin_lock(&sci->sc_state_lock);
wait_req.seq = ++sci->sc_seq_request;
+ add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
spin_unlock(&sci->sc_state_lock);
- init_waitqueue_entry(&wait_req.wq, current);
- add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
- set_current_state(TASK_INTERRUPTIBLE);
wake_up(&sci->sc_wait_daemon);
for (;;) {
+ set_current_state(TASK_INTERRUPTIBLE);
+
if (atomic_read(&wait_req.done)) {
err = wait_req.err;
break;
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
The quilt patch titled
Subject: nilfs2: fix use-after-free of timer for log writer thread
has been removed from the -mm tree. Its filename was
nilfs2-fix-use-after-free-of-timer-for-log-writer-thread.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: fix use-after-free of timer for log writer thread
Date: Mon, 20 May 2024 22:26:19 +0900
Patch series "nilfs2: fix log writer related issues".
This bug fix series covers three nilfs2 log writer-related issues,
including a timer use-after-free issue and potential deadlock issue on
unmount, and a potential freeze issue in event synchronization found
during their analysis. Details are described in each commit log.
This patch (of 3):
A use-after-free issue has been reported regarding the timer sc_timer on
the nilfs_sc_info structure.
The problem is that even though it is used to wake up a sleeping log
writer thread, sc_timer is not shut down until the nilfs_sc_info structure
is about to be freed, and is used regardless of the thread's lifetime.
Fix this issue by limiting the use of sc_timer only while the log writer
thread is alive.
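
As a rough illustration of "limiting the use of sc_timer only while the log writer thread is alive", the following user-space sketch models the lifetime rule with plain flags. All names are hypothetical; real kernel timers, locking, and the nilfs_sc_info structure are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical model of the timer/thread lifetime relationship. */
struct sc_model {
	bool task_alive;	/* stands in for sci->sc_task != NULL */
	bool timer_usable;	/* set up at thread start, shut down at thread exit */
	bool timer_armed;
};

/* Thread start: the timer is set up by the thread itself. */
void model_thread_start(struct sc_model *s)
{
	s->timer_usable = true;
	s->task_alive = true;
}

/* Thread exit: the timer is shut down before the thread goes away. */
void model_thread_exit(struct sc_model *s)
{
	s->task_alive = false;
	s->timer_armed = false;
	s->timer_usable = false;
}

/* Arm the timer only while the thread is alive; returns true if armed. */
bool model_start_timer(struct sc_model *s)
{
	if (!s->task_alive)
		return false;
	s->timer_armed = true;
	return true;
}
```

The point of the sketch is only that every arming path checks the thread's liveness, so nothing can re-arm the timer after the thread has exited.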
Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
Fixes: fdce895ea5dd ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: "Bai, Shuangpeng" <sjb7183(a)psu.edu>
Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJ
Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/segment.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)
--- a/fs/nilfs2/segment.c~nilfs2-fix-use-after-free-of-timer-for-log-writer-thread
+++ a/fs/nilfs2/segment.c
@@ -2118,8 +2118,10 @@ static void nilfs_segctor_start_timer(st
{
spin_lock(&sci->sc_state_lock);
if (!(sci->sc_state & NILFS_SEGCTOR_COMMIT)) {
- sci->sc_timer.expires = jiffies + sci->sc_interval;
- add_timer(&sci->sc_timer);
+ if (sci->sc_task) {
+ sci->sc_timer.expires = jiffies + sci->sc_interval;
+ add_timer(&sci->sc_timer);
+ }
sci->sc_state |= NILFS_SEGCTOR_COMMIT;
}
spin_unlock(&sci->sc_state_lock);
@@ -2320,10 +2322,21 @@ int nilfs_construct_dsync_segment(struct
*/
static void nilfs_segctor_accept(struct nilfs_sc_info *sci)
{
+ bool thread_is_alive;
+
spin_lock(&sci->sc_state_lock);
sci->sc_seq_accepted = sci->sc_seq_request;
+ thread_is_alive = (bool)sci->sc_task;
spin_unlock(&sci->sc_state_lock);
- del_timer_sync(&sci->sc_timer);
+
+ /*
+ * This function does not race with the log writer thread's
+ * termination. Therefore, deleting sc_timer, which should not be
+ * done after the log writer thread exits, can be done safely outside
+ * the area protected by sc_state_lock.
+ */
+ if (thread_is_alive)
+ del_timer_sync(&sci->sc_timer);
}
/**
@@ -2349,7 +2362,7 @@ static void nilfs_segctor_notify(struct
sci->sc_flush_request &= ~FLUSH_DAT_BIT;
/* re-enable timer if checkpoint creation was not done */
- if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) &&
+ if ((sci->sc_state & NILFS_SEGCTOR_COMMIT) && sci->sc_task &&
time_before(jiffies, sci->sc_timer.expires))
add_timer(&sci->sc_timer);
}
@@ -2539,6 +2552,7 @@ static int nilfs_segctor_thread(void *ar
int timeout = 0;
sci->sc_timer_task = current;
+ timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
/* start sync. */
sci->sc_task = current;
@@ -2606,6 +2620,7 @@ static int nilfs_segctor_thread(void *ar
end_thread:
/* end sync. */
sci->sc_task = NULL;
+ timer_shutdown_sync(&sci->sc_timer);
wake_up(&sci->sc_wait_task); /* for nilfs_segctor_kill_thread() */
spin_unlock(&sci->sc_state_lock);
return 0;
@@ -2669,7 +2684,6 @@ static struct nilfs_sc_info *nilfs_segct
INIT_LIST_HEAD(&sci->sc_gc_inodes);
INIT_LIST_HEAD(&sci->sc_iput_queue);
INIT_WORK(&sci->sc_iput_work, nilfs_iput_work_func);
- timer_setup(&sci->sc_timer, nilfs_construction_timeout, 0);
sci->sc_interval = HZ * NILFS_SC_DEFAULT_TIMEOUT;
sci->sc_mjcp_freq = HZ * NILFS_SC_DEFAULT_SR_FREQ;
@@ -2748,7 +2762,6 @@ static void nilfs_segctor_destroy(struct
down_write(&nilfs->ns_segctor_sem);
- timer_shutdown_sync(&sci->sc_timer);
kfree(sci);
}
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
The quilt patch titled
Subject: selftests/mm: fix build warnings on ppc64
has been removed from the -mm tree. Its filename was
selftests-mm-fix-build-warnings-on-ppc64.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Michael Ellerman <mpe(a)ellerman.id.au>
Subject: selftests/mm: fix build warnings on ppc64
Date: Tue, 21 May 2024 13:02:19 +1000
Fix warnings like:
In file included from uffd-unit-tests.c:8:
uffd-unit-tests.c: In function `uffd_poison_handle_fault':
uffd-common.h:45:33: warning: format `%llu' expects argument of type
`long long unsigned int', but argument 3 has type `__u64' {aka `long
unsigned int'} [-Wformat=]
By switching to unsigned long long for u64 for ppc64 builds.
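
A minimal sketch of the effect, assuming a Linux system with the kernel UAPI headers installed: defining __SANE_USERSPACE_TYPES__ before <linux/types.h> is first included selects the "ll64" type definitions, so a __u64 matches the %llu format. The helper name format_u64() is hypothetical and exists only for illustration.

```c
#define __SANE_USERSPACE_TYPES__	/* must precede any include of <linux/types.h> */
#include <linux/types.h>
#include <stdio.h>

/* Hypothetical helper: format a __u64 with %llu without a cast. */
int format_u64(char *buf, size_t len, __u64 value)
{
	return snprintf(buf, len, "%llu", value);
}
```

On architectures where __u64 is already unsigned long long the define changes nothing; it matters on ppc64, where the default user-space definition is unsigned long.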
Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.au
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/gup_test.c | 1 +
tools/testing/selftests/mm/uffd-common.h | 1 +
2 files changed, 2 insertions(+)
--- a/tools/testing/selftests/mm/gup_test.c~selftests-mm-fix-build-warnings-on-ppc64
+++ a/tools/testing/selftests/mm/gup_test.c
@@ -1,3 +1,4 @@
+#define __SANE_USERSPACE_TYPES__ // Use ll64
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
--- a/tools/testing/selftests/mm/uffd-common.h~selftests-mm-fix-build-warnings-on-ppc64
+++ a/tools/testing/selftests/mm/uffd-common.h
@@ -8,6 +8,7 @@
#define __UFFD_COMMON_H__
#define _GNU_SOURCE
+#define __SANE_USERSPACE_TYPES__ // Use ll64
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
_
Patches currently in -mm which might be from mpe(a)ellerman.id.au are
The quilt patch titled
Subject: selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
has been removed from the -mm tree. Its filename was
selftests-mm-compaction_test-fix-bogus-test-success-and-reduce-probability-of-oom-killer-invocation.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Dev Jain <dev.jain(a)arm.com>
Subject: selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
Date: Tue, 21 May 2024 13:13:58 +0530
Reset nr_hugepages to zero before the start of the test.
If a non-zero number of hugepages is already set before the start of the
test, the following problems arise:
- The probability of the test getting OOM-killed increases. Proof:
The test wants to run on 80% of available memory to prevent OOM-killing
(see original code comments). Let the value of mem_free at the start
of the test, when nr_hugepages = 0, be x. In the other case, when
nr_hugepages > 0, let the memory consumed by hugepages be y. In the
former case, the test operates on 0.8 * x of memory. In the latter,
the test operates on 0.8 * (x - y) of memory, with y already filled,
hence, memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 *
x. Q.E.D
- The probability of a bogus test success increases. Proof: Let the
memory consumed by hugepages be greater than 25% of x, with x and y
defined as above. The definition of compaction_index is c_index = (x -
y)/z where z is the memory consumed by hugepages after trying to
increase them again. In check_compaction(), we set the number of
hugepages to zero, and then increase them back; the probability that
they will be set back to consume at least y amount of memory again is
very high (since there is not much delay between the two attempts of
changing nr_hugepages). Hence, z >= y > (x/4) (by the 25% assumption).
Therefore, c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3.
Hence, c_index can always be forced to be less than 3, so the test
always succeeds. Q.E.D.
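
The two proofs above can be checked with concrete numbers. The sketch below uses arbitrary memory units and hypothetical helper names; it evaluates the footprint y + 0.8 * (x - y) and the bound (x - y)/y with integer arithmetic, so results are rounded down.

```c
/* Memory occupied when y units are already held by hugepages
 * and the test then uses 80% of the remaining (x - y) units. */
unsigned long test_footprint(unsigned long x, unsigned long y)
{
	return y + (x - y) * 8 / 10;
}

/* Upper bound on c_index = (x - y)/z when z >= y; y must be non-zero. */
unsigned long c_index_upper_bound(unsigned long x, unsigned long y)
{
	return (x - y) / y;
}
```

For example, with x = 1000 and y = 300 (more than 25% of x), the footprint is 860 instead of 800, and the bound on c_index evaluates to 2, which is below 3.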
Link: https://lkml.kernel.org/r/20240521074358.675031-4-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Sri Jayaramappa <sjayaram(a)akamai.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/compaction_test.c | 71 +++++++++++------
1 file changed, 49 insertions(+), 22 deletions(-)
--- a/tools/testing/selftests/mm/compaction_test.c~selftests-mm-compaction_test-fix-bogus-test-success-and-reduce-probability-of-oom-killer-invocation
+++ a/tools/testing/selftests/mm/compaction_test.c
@@ -82,13 +82,16 @@ int prereq(void)
return -1;
}
-int check_compaction(unsigned long mem_free, unsigned long hugepage_size)
+int check_compaction(unsigned long mem_free, unsigned long hugepage_size,
+ unsigned long initial_nr_hugepages)
{
unsigned long nr_hugepages_ul;
int fd, ret = -1;
int compaction_index = 0;
- char initial_nr_hugepages[20] = {0};
char nr_hugepages[20] = {0};
+ char init_nr_hugepages[20] = {0};
+
+ sprintf(init_nr_hugepages, "%lu", initial_nr_hugepages);
/* We want to test with 80% of available memory. Else, OOM killer comes
in to play */
@@ -102,23 +105,6 @@ int check_compaction(unsigned long mem_f
goto out;
}
- if (read(fd, initial_nr_hugepages, sizeof(initial_nr_hugepages)) <= 0) {
- ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
- goto close_fd;
- }
-
- lseek(fd, 0, SEEK_SET);
-
- /* Start with the initial condition of 0 huge pages*/
- if (write(fd, "0", sizeof(char)) != sizeof(char)) {
- ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
- goto close_fd;
- }
-
- lseek(fd, 0, SEEK_SET);
-
/* Request a large number of huge pages. The Kernel will allocate
as much as it can */
if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) {
@@ -146,8 +132,8 @@ int check_compaction(unsigned long mem_f
lseek(fd, 0, SEEK_SET);
- if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages))
- != strlen(initial_nr_hugepages)) {
+ if (write(fd, init_nr_hugepages, strlen(init_nr_hugepages))
+ != strlen(init_nr_hugepages)) {
ksft_print_msg("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n",
strerror(errno));
goto close_fd;
@@ -171,6 +157,41 @@ int check_compaction(unsigned long mem_f
return ret;
}
+int set_zero_hugepages(unsigned long *initial_nr_hugepages)
+{
+ int fd, ret = -1;
+ char nr_hugepages[20] = {0};
+
+ fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK);
+ if (fd < 0) {
+ ksft_print_msg("Failed to open /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
+ goto out;
+ }
+ if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) {
+ ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
+ goto close_fd;
+ }
+
+ lseek(fd, 0, SEEK_SET);
+
+ /* Start with the initial condition of 0 huge pages */
+ if (write(fd, "0", sizeof(char)) != sizeof(char)) {
+ ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
+ goto close_fd;
+ }
+
+ *initial_nr_hugepages = strtoul(nr_hugepages, NULL, 10);
+ ret = 0;
+
+ close_fd:
+ close(fd);
+
+ out:
+ return ret;
+}
int main(int argc, char **argv)
{
@@ -181,6 +202,7 @@ int main(int argc, char **argv)
unsigned long mem_free = 0;
unsigned long hugepage_size = 0;
long mem_fragmentable_MB = 0;
+ unsigned long initial_nr_hugepages;
ksft_print_header();
@@ -189,6 +211,10 @@ int main(int argc, char **argv)
ksft_set_plan(1);
+ /* Start the test without hugepages reducing mem_free */
+ if (set_zero_hugepages(&initial_nr_hugepages))
+ ksft_exit_fail();
+
lim.rlim_cur = RLIM_INFINITY;
lim.rlim_max = RLIM_INFINITY;
if (setrlimit(RLIMIT_MEMLOCK, &lim))
@@ -232,7 +258,8 @@ int main(int argc, char **argv)
entry = entry->next;
}
- if (check_compaction(mem_free, hugepage_size) == 0)
+ if (check_compaction(mem_free, hugepage_size,
+ initial_nr_hugepages) == 0)
ksft_exit_pass();
ksft_exit_fail();
_
Patches currently in -mm which might be from dev.jain(a)arm.com are
selftests-mm-va_high_addr_switch-reduce-test-noise.patch
selftests-mm-va_high_addr_switch-dynamically-initialize-testcases-to-enable-lpa2-testing.patch
The quilt patch titled
Subject: selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
has been removed from the -mm tree. Its filename was
selftests-mm-compaction_test-fix-incorrect-write-of-zero-to-nr_hugepages.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Dev Jain <dev.jain(a)arm.com>
Subject: selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
Date: Tue, 21 May 2024 13:13:57 +0530
Currently, the test tries to set nr_hugepages to zero, but that is not
actually done because the file offset is not reset after read(). Fix that
using lseek().
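
The file-offset behaviour can be reproduced on an ordinary temporary file. This is only a sketch under stated assumptions: it uses a regular file rather than /proc/sys/vm/nr_hugepages (whose write handling differs from a regular file), and the function name and the /tmp path template are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write "12345" to a temporary file, read it back, then write "0".
 * If rewind_first is true, the offset is reset with lseek() before the
 * write. The final file contents are stored in out (NUL-terminated).
 * Returns 0 on success, -1 on failure.
 */
int write_zero_after_read(bool rewind_first, char *out, size_t outlen)
{
	char path[] = "/tmp/offset_demo_XXXXXX";
	char buf[16] = {0};
	ssize_t n;
	int ret = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);	/* the file stays usable until fd is closed */

	if (write(fd, "12345", 5) != 5)
		goto out;
	if (lseek(fd, 0, SEEK_SET) < 0)
		goto out;
	if (read(fd, buf, sizeof(buf) - 1) != 5)	/* offset is now 5 */
		goto out;
	if (rewind_first && lseek(fd, 0, SEEK_SET) < 0)
		goto out;
	if (write(fd, "0", 1) != 1)
		goto out;

	/* Read back the whole file to see where the "0" landed. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		goto out;
	n = read(fd, out, outlen - 1);
	if (n < 0)
		goto out;
	out[n] = '\0';
	ret = 0;
out:
	close(fd);
	return ret;
}
```

Without the lseek() the "0" is written at the offset left by read(), after the existing data; with it, the write lands at the start of the file.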
Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Sri Jayaramappa <sjayaram(a)akamai.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/compaction_test.c | 2 ++
1 file changed, 2 insertions(+)
--- a/tools/testing/selftests/mm/compaction_test.c~selftests-mm-compaction_test-fix-incorrect-write-of-zero-to-nr_hugepages
+++ a/tools/testing/selftests/mm/compaction_test.c
@@ -108,6 +108,8 @@ int check_compaction(unsigned long mem_f
goto close_fd;
}
+ lseek(fd, 0, SEEK_SET);
+
/* Start with the initial condition of 0 huge pages*/
if (write(fd, "0", sizeof(char)) != sizeof(char)) {
ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
_
Patches currently in -mm which might be from dev.jain(a)arm.com are
selftests-mm-va_high_addr_switch-reduce-test-noise.patch
selftests-mm-va_high_addr_switch-dynamically-initialize-testcases-to-enable-lpa2-testing.patch
The quilt patch titled
Subject: selftests/mm: compaction_test: fix bogus test success on Aarch64
has been removed from the -mm tree. Its filename was
selftests-mm-compaction_test-fix-bogus-test-success-on-aarch64.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Dev Jain <dev.jain(a)arm.com>
Subject: selftests/mm: compaction_test: fix bogus test success on Aarch64
Date: Tue, 21 May 2024 13:13:56 +0530
Patch series "Fixes for compaction_test", v2.
The compaction_test memory selftest introduces fragmentation in memory
and then tries to allocate as many hugepages as possible. This series
addresses some problems.
On Aarch64, if nr_hugepages == 0, the test trivially succeeds: no
exception is raised on division by zero, so compaction_index becomes 0,
which is less than 3. We fix that by checking for division by zero.
Secondly, correctly set the number of hugepages to zero before trying
to set a large number of them.
Now, consider a situation in which, at the start of the test, a non-zero
number of hugepages have been already set (while running the entire
selftests/mm suite, or manually by the admin). The test operates on 80%
of memory to avoid OOM-killer invocation, and because some memory is
already blocked by hugepages, it would increase the chance of OOM-killing.
Also, since mem_free used in check_compaction() is the value before we
set nr_hugepages to zero, the chance that the compaction_index will
be small is very high if the preset nr_hugepages was high, leading to a
bogus test success.
This patch (of 3):
Currently, if at runtime we are not able to allocate a huge page, the test
will trivially pass on Aarch64 due to no exception being raised on
division by zero while computing compaction_index. Fix that by checking
for nr_hugepages == 0. More generally, avoid a division by zero by
exiting the program beforehand. While at it, fix a typo and handle the
case where the number of hugepages may overflow an integer.
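
The guard described above can be sketched as a standalone helper. The function name, its out-parameter, and the extra check on hugepage_size are assumptions made for illustration; the patch itself keeps the computation inline in check_compaction().

```c
#include <stdlib.h>

/*
 * Parse the hugepage count with strtoul() and refuse to divide when no
 * huge pages are available. Returns 0 and stores the index on success,
 * -1 if the divisor would be zero.
 */
int compute_compaction_index(unsigned long mem_free,
			     const char *nr_hugepages_str,
			     unsigned long hugepage_size,
			     unsigned long *index)
{
	unsigned long nr_hugepages_ul = strtoul(nr_hugepages_str, NULL, 10);

	if (!nr_hugepages_ul || !hugepage_size)
		return -1;

	*index = mem_free / (nr_hugepages_ul * hugepage_size);
	return 0;
}
```

Using strtoul() instead of atoi() also avoids truncating counts that do not fit in an int, which is the overflow case the commit message mentions.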
Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com
Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain(a)arm.com>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Sri Jayaramappa <sjayaram(a)akamai.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/compaction_test.c | 20 +++++++++++------
1 file changed, 13 insertions(+), 7 deletions(-)
--- a/tools/testing/selftests/mm/compaction_test.c~selftests-mm-compaction_test-fix-bogus-test-success-on-aarch64
+++ a/tools/testing/selftests/mm/compaction_test.c
@@ -82,12 +82,13 @@ int prereq(void)
return -1;
}
-int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
+int check_compaction(unsigned long mem_free, unsigned long hugepage_size)
{
+ unsigned long nr_hugepages_ul;
int fd, ret = -1;
int compaction_index = 0;
- char initial_nr_hugepages[10] = {0};
- char nr_hugepages[10] = {0};
+ char initial_nr_hugepages[20] = {0};
+ char nr_hugepages[20] = {0};
/* We want to test with 80% of available memory. Else, OOM killer comes
in to play */
@@ -134,7 +135,12 @@ int check_compaction(unsigned long mem_f
/* We should have been able to request at least 1/3 rd of the memory in
huge pages */
- compaction_index = mem_free/(atoi(nr_hugepages) * hugepage_size);
+ nr_hugepages_ul = strtoul(nr_hugepages, NULL, 10);
+ if (!nr_hugepages_ul) {
+ ksft_print_msg("ERROR: No memory is available as huge pages\n");
+ goto close_fd;
+ }
+ compaction_index = mem_free/(nr_hugepages_ul * hugepage_size);
lseek(fd, 0, SEEK_SET);
@@ -145,11 +151,11 @@ int check_compaction(unsigned long mem_f
goto close_fd;
}
- ksft_print_msg("Number of huge pages allocated = %d\n",
- atoi(nr_hugepages));
+ ksft_print_msg("Number of huge pages allocated = %lu\n",
+ nr_hugepages_ul);
if (compaction_index > 3) {
- ksft_print_msg("ERROR: Less that 1/%d of memory is available\n"
+ ksft_print_msg("ERROR: Less than 1/%d of memory is available\n"
"as huge pages\n", compaction_index);
goto close_fd;
}
_
Patches currently in -mm which might be from dev.jain(a)arm.com are
selftests-mm-va_high_addr_switch-reduce-test-noise.patch
selftests-mm-va_high_addr_switch-dynamically-initialize-testcases-to-enable-lpa2-testing.patch
The quilt patch titled
Subject: mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
has been removed from the -mm tree. Its filename was
mm-vmalloc-fix-vmalloc-which-may-return-null-if-called-with-__gfp_nofail.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: "Hailong.Liu" <hailong.liu(a)oppo.com>
Subject: mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
Date: Fri, 10 May 2024 18:01:31 +0800
commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
includes support for __GFP_NOFAIL, but it presents a conflict with commit
dd544141b9eb ("vmalloc: back off when the current task is OOM-killed"). A
possible scenario is as follows:
process-a
__vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL)
__vmalloc_area_node()
vm_area_alloc_pages()
--> oom-killer sends SIGKILL to process-a
if (fatal_signal_pending(current)) break;
--> return NULL;
To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages()
if __GFP_NOFAIL is set.
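
The control flow of the fix can be modelled in user space. This is a hypothetical sketch, not the kernel function: page allocation is reduced to a counter, and the pending fatal signal is simulated by a parameter giving the iteration at which it arrives.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the fallback allocation loop. A fatal signal is
 * considered pending once nr_allocated reaches signal_at (a value equal
 * to or greater than nr_pages means it never arrives in time).
 * Returns the number of pages "allocated".
 */
unsigned int alloc_pages_model(unsigned int nr_pages, bool nofail,
			       unsigned int signal_at)
{
	unsigned int nr_allocated = 0;

	while (nr_allocated < nr_pages) {
		bool fatal_signal = nr_allocated >= signal_at;

		/* With nofail set, a pending fatal signal no longer aborts. */
		if (!nofail && fatal_signal)
			break;

		nr_allocated++;
	}
	return nr_allocated;
}
```

A short count corresponds to the path that made the caller see NULL; with nofail set the loop always runs to completion in this model.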
This issue occurred during OPLUS KASAN TEST. Below is part of the log
-> oom-killer sends signal to process
[65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198
[65731.259685] [T32454] Call trace:
[65731.259698] [T32454] dump_backtrace+0xf4/0x118
[65731.259734] [T32454] show_stack+0x18/0x24
[65731.259756] [T32454] dump_stack_lvl+0x60/0x7c
[65731.259781] [T32454] dump_stack+0x18/0x38
[65731.259800] [T32454] mrdump_common_die+0x250/0x39c [mrdump]
[65731.259936] [T32454] ipanic_die+0x20/0x34 [mrdump]
[65731.260019] [T32454] atomic_notifier_call_chain+0xb4/0xfc
[65731.260047] [T32454] notify_die+0x114/0x198
[65731.260073] [T32454] die+0xf4/0x5b4
[65731.260098] [T32454] die_kernel_fault+0x80/0x98
[65731.260124] [T32454] __do_kernel_fault+0x160/0x2a8
[65731.260146] [T32454] do_bad_area+0x68/0x148
[65731.260174] [T32454] do_mem_abort+0x151c/0x1b34
[65731.260204] [T32454] el1_abort+0x3c/0x5c
[65731.260227] [T32454] el1h_64_sync_handler+0x54/0x90
[65731.260248] [T32454] el1h_64_sync+0x68/0x6c
[65731.260269] [T32454] z_erofs_decompress_queue+0x7f0/0x2258
--> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL);
The kernel panics on a NULL pointer dereference.
erofs assumes that kvmalloc with __GFP_NOFAIL never returns NULL.
[65731.260293] [T32454] z_erofs_runqueue+0xf30/0x104c
[65731.260314] [T32454] z_erofs_readahead+0x4f0/0x968
[65731.260339] [T32454] read_pages+0x170/0xadc
[65731.260364] [T32454] page_cache_ra_unbounded+0x874/0xf30
[65731.260388] [T32454] page_cache_ra_order+0x24c/0x714
[65731.260411] [T32454] filemap_fault+0xbf0/0x1a74
[65731.260437] [T32454] __do_fault+0xd0/0x33c
[65731.260462] [T32454] handle_mm_fault+0xf74/0x3fe0
[65731.260486] [T32454] do_mem_abort+0x54c/0x1b34
[65731.260509] [T32454] el0_da+0x44/0x94
[65731.260531] [T32454] el0t_64_sync_handler+0x98/0xb4
[65731.260553] [T32454] el0t_64_sync+0x198/0x19c
Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com
Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL")
Signed-off-by: Hailong.Liu <hailong.liu(a)oppo.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Suggested-by: Barry Song <21cnbao(a)gmail.com>
Reported-by: Oven <liyangouwen1(a)oppo.com>
Reviewed-by: Barry Song <baohua(a)kernel.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com>
Cc: Chao Yu <chao(a)kernel.org>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Gao Xiang <xiang(a)kernel.org>
Cc: Lorenzo Stoakes <lstoakes(a)gmail.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmalloc.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/vmalloc.c~mm-vmalloc-fix-vmalloc-which-may-return-null-if-called-with-__gfp_nofail
+++ a/mm/vmalloc.c
@@ -3498,7 +3498,7 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
{
unsigned int nr_allocated = 0;
gfp_t alloc_gfp = gfp;
- bool nofail = false;
+ bool nofail = gfp & __GFP_NOFAIL;
struct page *page;
int i;
@@ -3555,12 +3555,11 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
* and compaction etc.
*/
alloc_gfp &= ~__GFP_NOFAIL;
- nofail = true;
}
/* High-order pages or fallback path if "bulk" fails. */
while (nr_allocated < nr_pages) {
- if (fatal_signal_pending(current))
+ if (!nofail && fatal_signal_pending(current))
break;
if (nid == NUMA_NO_NODE)
_
Patches currently in -mm which might be from hailong.liu(a)oppo.com are
This is the start of the stable review cycle for the 4.19.315 release.
There are 18 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.315-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.315-rc1
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Tom Zanussi <tom.zanussi(a)linux.intel.com>
tracing: Remove unnecessary var_ref destroy in track_data_destroy()
Tom Zanussi <tom.zanussi(a)linux.intel.com>
tracing: Generalize hist trigger onmax and save action
Tom Zanussi <tom.zanussi(a)linux.intel.com>
tracing: Split up onmatch action data
Tom Zanussi <tom.zanussi(a)linux.intel.com>
tracing: Refactor hist trigger action code
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Have the historgram use the result of str_has_prefix() for len of prefix
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Use str_has_prefix() instead of using fixed sizes
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Use str_has_prefix() helper for histogram code
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
string.h: Add str_has_prefix() helper function
Steven Rostedt (VMware) <rostedt(a)goodmis.org>
tracing: Consolidate trace_add/remove_event_call back to the nolock functions
Masami Hiramatsu <mhiramat(a)kernel.org>
tracing: Remove unneeded synth_event_mutex
Masami Hiramatsu <mhiramat(a)kernel.org>
tracing: Use dyn_event framework for synthetic events
Masami Hiramatsu <mhiramat(a)kernel.org>
tracing: Add unified dynamic event framework
Masami Hiramatsu <mhiramat(a)kernel.org>
tracing: Simplify creation and deletion of synthetic events
Dominique Martinet <dominique.martinet(a)atmark-techno.com>
btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
Mikulas Patocka <mpatocka(a)redhat.com>
dm: limit the number of targets and parameter size area
Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
-------------
Diffstat:
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +-
drivers/md/dm-core.h | 2 +
drivers/md/dm-ioctl.c | 3 +-
drivers/md/dm-table.c | 9 +-
drivers/tty/serial/kgdboc.c | 30 +-
fs/btrfs/volumes.c | 1 +
include/linux/string.h | 20 +
include/linux/trace_events.h | 2 -
kernel/trace/Kconfig | 4 +
kernel/trace/Makefile | 1 +
kernel/trace/trace.c | 26 +-
kernel/trace/trace_dynevent.c | 210 ++++++
kernel/trace/trace_dynevent.h | 119 ++++
kernel/trace/trace_events.c | 32 +-
kernel/trace/trace_events_hist.c | 1082 ++++++++++++++++++------------
kernel/trace/trace_probe.c | 2 +-
kernel/trace/trace_stack.c | 2 +-
tools/testing/selftests/vm/map_hugetlb.c | 7 -
19 files changed, 1068 insertions(+), 489 deletions(-)
This is the start of the stable review cycle for the 5.4.277 release.
There are 16 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.277-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.277-rc1
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: displayport: Fix potential deadlock
Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()
Dominique Martinet <dominique.martinet(a)atmark-techno.com>
btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
Rob Herring <robh(a)kernel.org>
arm64: dts: qcom: Fix 'interrupt-map' parent address cells
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Harden accesses to the reset domains
Paulo Alcantara <pc(a)manguebit.com>
smb: client: fix potential OOBs in smb2_parse_contexts()
Doug Berger <opendmb(a)gmail.com>
net: bcmgenet: synchronize UMAC_CMD access
Doug Berger <opendmb(a)gmail.com>
net: bcmgenet: synchronize use of bcmgenet_set_rx_mode()
Doug Berger <opendmb(a)gmail.com>
net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access
Doug Berger <opendmb(a)gmail.com>
net: bcmgenet: keep MAC in reset until PHY is up
Doug Berger <opendmb(a)gmail.com>
Revert "net: bcmgenet: use RGMII loopback for MAC reset"
Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
Baokun Li <libaokun1(a)huawei.com>
ext4: fix bug_on in __es_tree_search
Sergey Shtylyov <s.shtylyov(a)omp.ru>
pinctrl: core: handle radix_tree_insert() errors in pinctrl_register_one_pin()
-------------
Diffstat:
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +-
arch/arm64/boot/dts/qcom/msm8998.dtsi | 8 +--
drivers/firmware/arm_scmi/reset.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 22 ++++--
drivers/net/ethernet/broadcom/genet/bcmgenet.h | 2 +
drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c | 12 +++-
drivers/net/ethernet/broadcom/genet/bcmmii.c | 43 +++---------
drivers/pinctrl/core.c | 14 +++-
drivers/tty/serial/kgdboc.c | 30 +++++++-
drivers/usb/typec/ucsi/displayport.c | 4 --
fs/btrfs/volumes.c | 1 +
fs/cifs/smb2ops.c | 4 +-
fs/cifs/smb2pdu.c | 79 ++++++++++++++--------
fs/cifs/smb2proto.h | 10 +--
fs/ext4/extents.c | 10 +--
tools/testing/selftests/vm/map_hugetlb.c | 7 --
18 files changed, 161 insertions(+), 99 deletions(-)
This is the start of the stable review cycle for the 5.10.218 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 25 May 2024 13:03:15 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.218-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.218-rc1
Akira Yokosawa <akiyks(a)gmail.com>
docs: kernel_include.py: Cope with docutils 0.21
Daniel Thompson <daniel.thompson(a)linaro.org>
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
usb: typec: ucsi: displayport: Fix potential deadlock
Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()
Dominique Martinet <dominique.martinet(a)atmark-techno.com>
btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()
Paolo Abeni <pabeni(a)redhat.com>
mptcp: ensure snd_nxt is properly initialized on connect
Cristian Marussi <cristian.marussi(a)arm.com>
firmware: arm_scmi: Harden accesses to the reset domains
Sean Christopherson <seanjc(a)google.com>
KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
Eric Dumazet <edumazet(a)google.com>
netlink: annotate lockless accesses to nlk->max_recvmsg_len
liqiong <liqiong(a)nfschina.com>
ima: fix deadlock when traversing "ima_default_rules".
Doug Berger <opendmb(a)gmail.com>
net: bcmgenet: synchronize UMAC_CMD access
Doug Berger <opendmb(a)gmail.com>
net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access
Harshit Mogalapalli <harshit.m.mogalapalli(a)oracle.com>
Revert "selftests: mm: fix map_hugetlb failure on 64K page size systems"
Juergen Gross <jgross(a)suse.com>
x86/xen: Drop USERGS_SYSRET64 paravirt call
Sergey Shtylyov <s.shtylyov(a)omp.ru>
pinctrl: core: handle radix_tree_insert() errors in pinctrl_register_one_pin()
-------------
Diffstat:
Documentation/sphinx/kernel_include.py | 1 -
Makefile | 4 +--
arch/x86/entry/entry_64.S | 17 ++++++------
arch/x86/include/asm/irqflags.h | 7 -----
arch/x86/include/asm/paravirt.h | 5 ----
arch/x86/include/asm/paravirt_types.h | 8 ------
arch/x86/kernel/asm-offsets_64.c | 2 --
arch/x86/kernel/paravirt.c | 5 +---
arch/x86/kernel/paravirt_patch.c | 4 ---
arch/x86/kvm/x86.c | 11 ++++++--
arch/x86/xen/enlighten_pv.c | 1 -
arch/x86/xen/xen-asm.S | 21 ---------------
arch/x86/xen/xen-ops.h | 2 --
drivers/firmware/arm_scmi/reset.c | 6 ++++-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +++
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 12 ++++++++-
drivers/net/ethernet/broadcom/genet/bcmgenet.h | 2 ++
drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c | 6 +++++
drivers/net/ethernet/broadcom/genet/bcmmii.c | 4 +++
drivers/pinctrl/core.c | 14 +++++++---
drivers/tty/serial/kgdboc.c | 30 +++++++++++++++++++++-
drivers/usb/typec/ucsi/displayport.c | 4 ---
fs/btrfs/volumes.c | 1 +
net/mptcp/protocol.c | 2 ++
net/netlink/af_netlink.c | 15 ++++++-----
security/integrity/ima/ima_policy.c | 29 ++++++++++++++-------
tools/testing/selftests/vm/map_hugetlb.c | 7 -----
27 files changed, 123 insertions(+), 100 deletions(-)
We recently upgraded the view of ESR_EL2 to 64bit, in keeping with
the requirements of the architecture.
However, the AArch32 emulation code was left unaudited, and the
(already dodgy) code that triages whether a trap is spurious or not
(because the condition code failed) broke in a subtle way:
If ESR_EL2.ISS2 is ever non-zero (unlikely, but hey, this is the ARM
architecture we're talking about), the hack that tests the top bits
of ESR_EL2.EC will break in an interesting way.
Instead, use kvm_vcpu_trap_get_class() to obtain the EC, and list
all the possible ECs that can fail a condition code check.
While we're at it, add SMC32 to the list, as it is explicitly listed
as being allowed to trap despite failing a condition code check (as
described in the HCR_EL2.TSC documentation).
Fixes: 0b12620fddb8 ("KVM: arm64: Treat ESR_EL2 as a 64-bit register")
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
arch/arm64/kvm/hyp/aarch32.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/hyp/aarch32.c b/arch/arm64/kvm/hyp/aarch32.c
index 8d9670e6615d..449fa58cf3b6 100644
--- a/arch/arm64/kvm/hyp/aarch32.c
+++ b/arch/arm64/kvm/hyp/aarch32.c
@@ -50,9 +50,23 @@ bool kvm_condition_valid32(const struct kvm_vcpu *vcpu)
u32 cpsr_cond;
int cond;
- /* Top two bits non-zero? Unconditional. */
- if (kvm_vcpu_get_esr(vcpu) >> 30)
+ /*
+ * These are the exception classes that could fire with a
+ * conditional instruction.
+ */
+ switch (kvm_vcpu_trap_get_class(vcpu)) {
+ case ESR_ELx_EC_CP15_32:
+ case ESR_ELx_EC_CP15_64:
+ case ESR_ELx_EC_CP14_MR:
+ case ESR_ELx_EC_CP14_LS:
+ case ESR_ELx_EC_FP_ASIMD:
+ case ESR_ELx_EC_CP10_ID:
+ case ESR_ELx_EC_CP14_64:
+ case ESR_ELx_EC_SVC32:
+ break;
+ default:
return true;
+ }
/* Is condition field valid? */
cond = kvm_vcpu_get_condition(vcpu);
--
2.39.2
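For readers following along, here is a minimal userspace sketch of why the old test misfires once ESR_EL2 is handled as a 64-bit value. It is not the kernel code: the helper and macro names are hypothetical. The only facts it relies on are that the EC field sits in bits [31:26], that ISS2 starts at bit 32, and that 0x03 is the EC for a trapped 32-bit CP15 access.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; the kernel uses its own ESR_ELx_* macros. */
#define EC_SHIFT   26
#define EC_MASK    (UINT64_C(0x3f) << EC_SHIFT)
#define EC_CP15_32 0x03 /* trapped MCR/MRC, which can be conditional */

/*
 * Old triage: "top two bits of EC non-zero means unconditional".
 * With a 64-bit ESR, the shift also keeps every bit from 32 upwards,
 * so a non-zero ISS2 makes any trap look unconditional.
 */
static bool old_is_unconditional(uint64_t esr)
{
	return (esr >> 30) != 0;
}

/* Extract only the EC field, ignoring ISS2 and anything above it. */
static unsigned int esr_ec(uint64_t esr)
{
	return (unsigned int)((esr & EC_MASK) >> EC_SHIFT);
}
```

With ISS2 clear, a CP15_32 trap is correctly treated as possibly conditional; setting bit 32 flips the old test, while the extracted EC is unchanged.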
> I am using vanilla Linux 6.8.10, and I've just noticed this BUG in my
> dmesg log. I have no idea what triggered it, and especially since I
> have not even mounted any NFS filesystems?!
Hi all,
I have the exact same bug. I'm using the NixOS kernel, and as soon as it
was updated to 6.8.10 my server went into a crash-reboot loop.
The server hosts an NFS daemon and crashes about 10 seconds
after the tty login prompt is displayed.
Downgrading to 6.8.9 fixes the issue.
Regards,
Paul Grandperrin
Hi all,
I am running a dual Xeon machine as my personal virtualization server at home, using
Proxmox VE, and with their latest update 8.2 which brings kernel 6.8.4-2-pve, I am seeing
a serious regression which breaks my setup because it does not boot any more. The last
message I see displayed during boot is: "Timed out for waiting the udev queue being
empty.", and then it hangs indefinitely.
Previous kernel 6.5.13-5-pve worked fine, with the following caveat: I had similar
problems initially with earlier kernels too, so from the very beginning with this machine
using PVE, I had to set grub parameter rootdelay=60. With that, everything was fine, the
busses settled and RAID controller and root device was found and system booted. With the
newer 6.8.4 kernel, not any more, although I even tried to increase rootdelay parameter to
120.
I was able to reproduce and bisect this regression also with mainline kernels (also with
stable 6.8.8 and 6.9-rc), so I thought it would be a good idea to report it upstream to
you guys.
This is an older server machine: 2-socket Ivy Bridge Xeon E5-2697 v2 (24C/48T) in an Asus
Z9PE-D16/2L motherboard (Intel C-602A chipset); BIOS patched to the latest available from
Asus. All memory slots occupied, so 256 GB RAM in total. It also has Asus ASMB6 iKVM BMC,
which supplies virtual storage devices (see dmesg below) to which ISO images can be
attached via network to boot/install OS from.
Storage config:
I have two single M4 256 GiB SATA SSD drives attached to internal mainboard SATA ports;
one of them is my root device and PVE installation drive. The other one I use for storing
ISO images. My main VM storage is attached to a battery backed-up Adaptec 5805 SATA/SAS
RAID controller (w/ latest FW build 18948) attached to SATA/SAS enclosure of my Supermicro
server casing, having eight disk drives in total: I have one RAID1 Array, consisting of
two Samsung 1 TiB SATA SSDs for VM root disk images, and one RAID5 Array, consisting of 6
Hitachi 1 TiB HDDs which I use for storing VM data disk images. On both arrays, I use a
LVM thin pool as PVE storage location. When everything boots up, the system is running
just fine and smoothly with ~15 VMs at the same time (and has for years!). Although this
is "only" a homelab server, I love it dearly and use it for many private project VMs,
among them a Windows Server VM running MS SQL Server, and Linux server VMs running
Oracle Database Server (I'm a database guy).
I attach dmesg output of previous working kernel 6.5.13-5-pve, my git bisect log and
output of lspci -v. The last successful kernel messages I see from the failing kernel
versions are these:
...
[ 5.540424] usb-storage 1-1.3.4:1.0: USB Mass Storage device detected
[ 5.540670] scsi host10: usb-storage 1-1.3.4:1.0
[ 5.947794] scsi 8:0:0:0: CD-ROM AMI Virtual CDROM0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.267830] scsi 9:0:0:0: Direct-Access AMI Virtual Floppy0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.555845] scsi 10:0:0:0: Direct-Access AMI Virtual HDISK0 1.00 PQ: 0 ANSI: 0 CCS
and then the error message "Timed out for waiting the udev queue being empty." and the
system hangs. In case of working kernels, the boot process would continue with this:
...
[ 5.947794] scsi 8:0:0:0: CD-ROM AMI Virtual CDROM0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.267830] scsi 9:0:0:0: Direct-Access AMI Virtual Floppy0 1.00 PQ: 0 ANSI: 0 CCS
[ 6.555845] scsi 10:0:0:0: Direct-Access AMI Virtual HDISK0 1.00 PQ: 0 ANSI: 0 CCS
[ 32.592054] scsi 0:3:1:0: Enclosure ADAPTEC Virtual SGPIO 1 0001 PQ: 0 ANSI: 5
[ 61.536097] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 61.536215] sd 0:0:0:0: [sda] 1998565376 512-byte logical blocks: (1.02 TB/953 GiB)
[ 61.536236] sd 0:0:1:0: Attached scsi generic sg1 type 0
[ 61.536239] sd 0:0:0:0: [sda] Write Protect is off
[ 61.536246] sd 0:0:0:0: [sda] Mode Sense: 12 00 10 08
[ 61.536283] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 61.536340] scsi 0:1:0:0: Attached scsi generic sg2 type 0
[ 61.536383] sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
[ 61.536400] sd 0:0:1:0: [sdb] 9762222080 512-byte logical blocks: (5.00 TB/4.54 TiB)
[ 61.536414] sd 0:0:1:0: [sdb] Write Protect is off
[ 61.536418] sd 0:0:1:0: [sdb] Mode Sense: 12 00 10 08
[ 61.536439] sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 61.536455] scsi 0:1:1:0: Attached scsi generic sg3 type 0
[ 61.536616] scsi 0:1:2:0: Attached scsi generic sg4 type 0
[ 61.536750] scsi 0:1:3:0: Attached scsi generic sg5 type 0
[ 61.536840] scsi 0:1:4:0: Attached scsi generic sg6 type 0
[ 61.536930] scsi 0:1:5:0: Attached scsi generic sg7 type 0
[ 61.537027] scsi 0:1:6:0: Attached scsi generic sg8 type 0
[ 61.537122] scsi 0:1:7:0: Attached scsi generic sg9 type 0
[ 61.537248] sd 0:0:1:0: [sdb] Very big device. Trying to use READ CAPACITY(16).
[ 61.537274] scsi 0:3:0:0: Attached scsi generic sg10 type 13
[ 61.537390] scsi 0:3:1:0: Attached scsi generic sg11 type 13
[ 61.537558] scsi 1:0:0:0: Direct-Access ATA M4-CT256M4SSD2 0309 PQ: 0 ANSI: 5
[ 61.537851] sd 1:0:0:0: Attached scsi generic sg12 type 0
[ 61.537919] scsi: waiting for bus probes to complete ...
[ 61.537973] sd 1:0:0:0: [sdc] 500118192 512-byte logical blocks: (256 GB/238 GiB)
[ 61.537986] sd 1:0:0:0: [sdc] Write Protect is off
[ 61.537989] sd 1:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[ 61.538002] sd 1:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 61.538022] sd 1:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
[ 61.538924] sdc: sdc1 sdc2 < sdc5 >
...
so it seems to me that the initialization of the Adaptec controller is the culprit.
I have tested and reproduced the regression with mainline kernels according to the
following list (please excuse me if it's too long ;-)
See at the very bottom for first bad commit I found this way. I always built as "make
olddefconfig" using the 6.5.13-5-pve config as starting point.
-------------------------------------------------------------------
Proxmox Virtual Environment (PVE) Kernels
=========================================
6.5.13-5-pve WORKS last working PVE (8.1) kernel; 5.15-pve and 6.2-pve work too
6.8.4-2-pve NOPE PVE release 8.2
Mainline Kernels
================
6.9.0-rc6+ NOPE Most recent (2024-05-01)
6.9.0-rc5+ NOPE Most recent (2024-04-27)
6.8.8 NOPE Most recent released (2024-04-29)
6.8.7 NOPE Most recent released (2024-04-27)
6.8.4 NOPE Same version as most recent released PVE 8.2 Kernel
6.5.13 WORKS
My tests, reverts on top of 6.8.8
=================================
6.8.8+ WORKS Revert "Merge tag 'scsi-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi" - This reverts commit
6d20acbf3e3a32d331947dbc3802cf2d1a399e7d, reversing changes made to
fef85269a19d277f23fc5ff08a3c356beeb54cb3
6.8.8+ WORKS Revert "scsi: core: Consult supported VPD page list prior to
fetching page" - This reverts commit b5fc07a5fb56216a49e6c1d0b172d5464d99a89b (this is the
first bad commit of my bisect session, see below, and a single patch as part of the above
merged tag 'scsi-fixes')
Bisecting, starting from 6.9.0-rc5 (bad) and 6.5.13 (good)
==========================================================
root@linus:/usr/src/linux# git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
root@linus:/usr/src/linux# git log
commit 9d1ddab261f3e2af7c384dc02238784ce0cf9f98 (HEAD -> master, origin/master, origin/HEAD)
Merge: 71b1543c83d6 77d8aa79ecfb
Author: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Tue Apr 23 09:37:32 2024 -0700
Merge tag '6.9-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
root@linus:/usr/src/linux# cp /boot/config-6.5.13-5-pve .config
root@linus:/usr/src/linux# git bisect start
status: waiting for both good and bad commits
root@linus:/usr/src/linux# git bisect bad
status: waiting for good commit(s), bad commit known
root@linus:/usr/src/linux# git bisect good v6.5.13
Bisecting: a merge base must be tested
[2dde18cd1d8fac735875f2e4987f11817cc0bc2c] Linux 6.5
root@linus:/usr/src/linux# make olddefconfig
.config:10571:warning: symbol value 'm' invalid for ANDROID_BINDER_IPC
.config:10572:warning: symbol value 'm' invalid for ANDROID_BINDERFS
#
# configuration written to .config
#
root@linus:/usr/src/linux# make -j 48
=> 6.5.0 (Merge Base) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 32111 revisions left to test after this (roughly 15 steps)
[0f5cc96c367f2e780eb492cc9cab84e3b2ca88da] Merge tag 's390-6.7-3' of
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
root@linus:/usr/src/linux# make -j 48
=> 6.7.0-rc2+ WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 16056 revisions left to test after this (roughly 14 steps)
[ee138217c32ccbfa75d5ea6b766158148e98f6fa] Merge tag 'btree-remove-btnum-6.9_2024-02-23'
of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC
=> 6.8.0-rc4+ WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 8214 revisions left to test after this (roughly 13 steps)
[e5e038b7ae9da96b93974bf072ca1876899a01a3] Merge tag 'fs_for_v6.9-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
=> 6.8.0+ NOPE => does not find root device, does not boot;
message: "BUG: arch topology borken the CPU
domain not a subset of > the NUMA domain"
message: "Timed out for waiting the udev
queue being empty."
root@linus:/usr/src/linux# git bisect bad
Bisecting: 3954 revisions left to test after this (roughly 12 steps)
[f153fbe1ea11939e2514ba4b3b62bbd946e2892c] Merge tag 'erofs-for-6.9-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
=> 6.8.0+ (HEAD detached at f153fbe1ea11) NOPE => same as above
root@linus:/usr/src/linux# git bisect bad
Bisecting: 1945 revisions left to test after this (roughly 11 steps)
[1ddeeb2a058d7b2a58ed9e820396b4ceb715d529] Merge tag 'for-6.9/block-20240310' of
git://git.kernel.dk/linux
=> 6.8.0+ (HEAD detached at 1ddeeb2a058d) NOPE => same as above
root@linus:/usr/src/linux# git bisect bad
Bisecting: 970 revisions left to test after this (roughly 10 steps)
[2652b99e43403dc464f3648483ffb38e48872fe4] ice: virtchnl: stop pretending to support RSS
over AQ or registers
=> 6.8.0-rc6+ (2652b99e4340) NOPE => same
root@linus:/usr/src/linux# git bisect bad
Bisecting: 506 revisions left to test after this (roughly 9 steps)
[efa80dcbb7a3ecc4a1b2f54624c49b5a612f92b3] Merge tag 'trace-v6.8-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
=> 6.8.0-rc5+ (efa80dcbb7a3) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 251 revisions left to test after this (roughly 8 steps)
[c6a597fcc7ad7335a3ecf8f5287a0459f793a257] Merge tag 'loongarch-fixes-6.8-3' of
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
=> 6.8.0-rc5+ (c6a597fcc7ad) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 126 revisions left to test after this (roughly 7 steps)
[cf1182944c7cc9f1c21a8a44e0d29abe12527412] Merge tag 'lsm-pr-20240227' of
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
=> 6.8.0-rc6+ (cf1182944c7c) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 62 revisions left to test after this (roughly 6 steps)
[4ca0d9894fd517a2f2c0c10d26ebe99ab4396fe3] Merge tag 'erofs-for-6.8-rc6-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
=> 6.8.0-rc5+ (4ca0d9894fd5) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 36 revisions left to test after this (roughly 5 steps)
[ac389bc0ca56e1a2f92b2a17e58298390a3879a8] Merge tag 'cxl-fixes-6.8-rc6' of
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
=> 6.8.0-rc5+ (ac389bc0ca56) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 12 revisions left to test after this (roughly 4 steps)
[40de53fd002c6ba087a623722915e8006ed68a02] Merge branch 'for-6.8/cxl-cper' into for-6.8/cxl
=> 6.8.0-rc5+ (40de53fd002c) WORKS
root@linus:/usr/src/linux# git bisect good
Bisecting: 6 revisions left to test after this (roughly 3 steps)
[9ddf190a7df77b77817f955fdb9c2ae9d1c9c9a3] scsi: jazz_esp: Only build if SCSI core is builtin
=> 6.8.0-rc1+ (9ddf190a7df7) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 2 revisions left to test after this (roughly 2 steps)
[de959094eb2197636f7c803af0943cb9d3b35804] scsi: target: pscsi: Fix bio_put() for error case
=> 6.8.0-rc1+ (de959094eb21) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 0 revisions left to test after this (roughly 1 step)
[b5fc07a5fb56216a49e6c1d0b172d5464d99a89b] scsi: core: Consult supported VPD page list
prior to fetching page
=> 6.8.0-rc1+ (b5fc07a5fb56) NOPE
root@linus:/usr/src/linux# git bisect bad
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[321da3dc1f3c92a12e3c5da934090d2992a8814c] scsi: sd: usb_storage: uas: Access media prior
to querying device properties
=> 6.8.0-rc1+ (321da3dc1f3c) WORKS
root@linus:/usr/src/linux# git bisect good
b5fc07a5fb56216a49e6c1d0b172d5464d99a89b is the first bad commit
commit b5fc07a5fb56216a49e6c1d0b172d5464d99a89b
Author: Martin K. Petersen <martin.petersen(a)oracle.com>
Date: Wed Feb 14 17:14:11 2024 -0500
scsi: core: Consult supported VPD page list prior to fetching page
Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
page") removed the logic which checks whether a VPD page is present on
the supported pages list before asking for the page itself. That was
done because SPC helpfully states "The Supported VPD Pages VPD page
list may or may not include all the VPD pages that are able to be
returned by the device server". Testing had revealed a few devices
that supported some of the 0xBn pages but didn't actually list them in
page 0.
Julian Sikorski bisected a problem with his drive resetting during
discovery to the commit above. As it turns out, this particular drive
firmware will crash if we attempt to fetch page 0xB9.
Various approaches were attempted to work around this. In the end,
reinstating the logic that consults VPD page 0 before fetching any
other page was the path of least resistance. A firmware update for the
devices which originally compelled us to remove the check has since
been released.
Link: https://lore.kernel.org/r/20240214221411.2888112-1-martin.petersen@oracle.c…
Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
Cc: stable(a)vger.kernel.org
Cc: Bart Van Assche <bvanassche(a)acm.org>
Reported-by: Julian Sikorski <belegdol(a)gmail.com>
Tested-by: Julian Sikorski <belegdol(a)gmail.com>
Reviewed-by: Lee Duncan <lee.duncan(a)suse.com>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
drivers/scsi/scsi.c | 22 ++++++++++++++++++++--
include/scsi/scsi_device.h | 4 ----
2 files changed, 20 insertions(+), 6 deletions(-)
root@linus:/usr/src/linux#
-------------------------------------------------------------------
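To make the behaviour described in the commit message above concrete, here is a minimal userspace sketch of "consult the supported VPD page list before fetching a page". It is not the code from drivers/scsi/scsi.c. The function name is hypothetical, and the buffer layout is an assumption based on the usual SPC format of VPD page 0x00: page code in byte 1, big-endian list length in bytes 2-3, one supported page code per byte from byte 4 onwards.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if `page` appears in the Supported VPD Pages list held in
 * `page0` (the response to an INQUIRY for VPD page 0x00).
 * Hypothetical helper for illustration; a caller would skip the fetch
 * of `page` entirely when this returns false.
 */
static bool vpd_page_listed(const uint8_t *page0, size_t buf_len,
			    uint8_t page)
{
	size_t list_len, i;

	/* Need at least the 4-byte header, and it must be page 0x00. */
	if (buf_len < 4 || page0[1] != 0x00)
		return false;

	list_len = ((size_t)page0[2] << 8) | page0[3];
	if (list_len > buf_len - 4)	/* never read past the buffer */
		list_len = buf_len - 4;

	for (i = 0; i < list_len; i++)
		if (page0[4 + i] == page)
			return true;

	return false;
}
```

A device that lists 0x00, 0x80 and 0xB0 would never be asked for 0xB9 under this scheme, which is the case the commit message describes as crashing one drive's firmware.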
Best regards,
Peter Schneider
--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.
OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&search=pschneider1968@googlem…
https://keys.mailvelope.com/pks/lookup?op=get&search=pschneider1968@gmail.c…
On Fri, May 24, 2024 at 01:07:18AM +0000, Lin Gui (桂林) wrote:
> Dear @Greg KH<mailto:gregkh@linuxfoundation.org>,
>
> Base : kernel-5.15.159
>
> diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> index a569066..d656964 100644
> --- a/drivers/mmc/core/mmc.c
> +++ b/drivers/mmc/core/mmc.c
> @@ -1800,7 +1800,13 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> if (err)
> goto free_card;
>
> - } else if (!mmc_card_hs400es(card)) {
> + } else if (mmc_card_hs400es(card)){
> + if (host->ops->execute_hs400_tuning) {
> + err = host->ops->execute_hs400_tuning(host, card);
> + if (err)
> + goto free_card;
> + }
> + } else {
> /* Select the desired bus width optionally */
> err = mmc_select_bus_width(card);
> if (err > 0 && mmc_card_hs(card)) {
>
The patch is corrupted, and sent in html format.
But most importantly, you did not test this to verify it works at all,
which means that you don't really need it?
confused,
greg k-h
From: Jeff Xu <jeffxu(a)google.com>
Add documentation for MFD_NOEXEC_SEAL and MFD_EXEC
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeff Xu <jeffxu(a)google.com>
---
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/mfd_noexec.rst | 90 ++++++++++++++++++++++
2 files changed, 91 insertions(+)
create mode 100644 Documentation/userspace-api/mfd_noexec.rst
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 5926115ec0ed..8a251d71fa6e 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -32,6 +32,7 @@ Security-related interfaces
seccomp_filter
landlock
lsm
+ mfd_noexec
spec_ctrl
tee
diff --git a/Documentation/userspace-api/mfd_noexec.rst b/Documentation/userspace-api/mfd_noexec.rst
new file mode 100644
index 000000000000..6f11ad86b076
--- /dev/null
+++ b/Documentation/userspace-api/mfd_noexec.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Introduction of non executable mfd
+==================================
+:Author:
+ Daniel Verkamp <dverkamp(a)chromium.org>
+ Jeff Xu <jeffxu(a)google.com>
+
+:Contributor:
+ Aleksa Sarai <cyphar(a)cyphar.com>
+ Barnabás Pőcze <pobrn(a)protonmail.com>
+ David Rheinsberg <david(a)readahead.eu>
+
+Since Linux introduced the memfd feature, memfd have always had their
+execute bit set, and the memfd_create() syscall doesn't allow setting
+it differently.
+
+However, in a secure by default system, such as ChromeOS, (where all
+executables should come from the rootfs, which is protected by Verified
+boot), this executable nature of memfd opens a door for NoExec bypass
+and enables “confused deputy attack”. E.g, in VRP bug [1]: cros_vm
+process created a memfd to share the content with an external process,
+however the memfd is overwritten and used for executing arbitrary code
+and root escalation. [2] lists more VRP in this kind.
+
+On the other hand, executable memfd has its legit use, runc uses memfd’s
+seal and executable feature to copy the contents of the binary then
+execute them, for such system, we need a solution to differentiate runc's
+use of executable memfds and an attacker's [3].
+
+To address those above.
+ - Let memfd_create() set X bit at creation time.
+ - Let memfd be sealed for modifying X bit when NX is set.
+ - A new pid namespace sysctl: vm.memfd_noexec to help applications to
+ migrating and enforcing non-executable MFD.
+
+User API
+========
+``int memfd_create(const char *name, unsigned int flags)``
+
+``MFD_NOEXEC_SEAL``
+ When MFD_NOEXEC_SEAL bit is set in the ``flags``, memfd is created
+ with NX. F_SEAL_EXEC is set and the memfd can't be modified to
+ add X later.
+ This is the most common case for the application to use memfd.
+
+``MFD_EXEC``
+ When MFD_EXEC bit is set in the ``flags``, memfd is created with X.
+
+Note:
+ ``MFD_NOEXEC_SEAL`` and ``MFD_EXEC`` doesn't change the sealable
+ characteristic of memfd, which is controlled by ``MFD_ALLOW_SEALING``.
+
+
+Sysctl:
+========
+``pid namespaced sysctl vm.memfd_noexec``
+
+The new pid namespaced sysctl vm.memfd_noexec has 3 values:
+
+ - 0: MEMFD_NOEXEC_SCOPE_EXEC
+ memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
+ MFD_EXEC was set.
+
+ - 1: MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL
+ memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
+ MFD_NOEXEC_SEAL was set.
+
+ - 2: MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED
+ memfd_create() without MFD_NOEXEC_SEAL will be rejected.
+
+The sysctl allows finer control of memfd_create for old-software that
+doesn't set the executable bit, for example, a container with
+vm.memfd_noexec=1 means the old-software will create non-executable memfd
+by default while new-software can create executable memfd by setting
+MFD_EXEC.
+
+The value of memfd_noexec is passed to child namespace at creation time,
+in addition, the setting is hierarchical, i.e. during memfd_create,
+we will search from current ns to root ns and use the most restrictive
+setting.
+
+Reference:
+==========
+[1] https://crbug.com/1305267
+
+[2] https://bugs.chromium.org/p/chromium/issues/list?q=type%3Dbug-security%20me…
+
+[3] https://lwn.net/Articles/781013/
--
2.45.1.288.g0e0cd299f1-goog
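As a usage illustration of the flags documented above, here is a minimal userspace sketch. The helper names are hypothetical. The flag values are the ones from the uapi header, defined here as fallbacks for older toolchains, and the raw syscall is used only so the sketch builds against C libraries that lack a memfd_create() wrapper. The retry on EINVAL is an assumption about how an application might cope with kernels that predate MFD_NOEXEC_SEAL; it is not something the document prescribes. The second helper models the "most restrictive setting wins" rule, assuming a higher sysctl value means a more restrictive one.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback definitions for older headers (values from linux/memfd.h). */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC	0x0001U
#endif
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL	0x0008U
#endif

/*
 * Create a non-executable, exec-sealed memfd.
 * Assumption: on kernels that reject the new flag with EINVAL we retry
 * without it, which yields a legacy (executable) memfd.
 * Returns the fd, or -1 with errno set.
 */
static int create_noexec_memfd(const char *name)
{
	long fd = syscall(SYS_memfd_create, name,
			  MFD_CLOEXEC | MFD_NOEXEC_SEAL);

	if (fd < 0 && errno == EINVAL)
		fd = syscall(SYS_memfd_create, name, MFD_CLOEXEC);

	return (int)fd;
}

/*
 * Model of the hierarchical lookup: walk the vm.memfd_noexec values
 * from the current pid namespace up to the root and keep the most
 * restrictive (numerically largest) one.
 */
static int effective_noexec_scope(const int *scopes, size_t n)
{
	int worst = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (scopes[i] > worst)
			worst = scopes[i];

	return worst;
}
```

For example, a namespace chain with values 0, 2 and 1 resolves to 2, so a memfd_create() call there would be governed by the enforced setting.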
Hello José,
I'm testing on the 6.6 kernel with a "0b95:1790 ASIX Electronics Corp.
AX88179 Gigabit Ethernet" device.
After applying commit 56f78615bcb1 ("net: usb: ax88179_178a: avoid
writing the mac address before first reading"),
the network no longer works after bringing the device down.
After plugging in the device, it generally will work with ifconfig:
$ ifconfig eth0 <ip address>
However, if I then try bringing the device down and back up, it no longer works.
$ ifconfig eth0 down
$ ifconfig eth0 <ip address>
$ ethtool eth0 | grep detected
Link detected: no
The link will continue to report as undetected.
If I revert 56f78615bcb1 the device will work after bringing it down
and back up.
If I build at commit d7a319889498 ("net: usb: ax88179_178a: avoid two
consecutive device resets") and its
parent d7a319889498^ these also work.
Is this something you have seen before with your test devices?
Regards,
Jeff
Hi all,
This series fixes some issues in the bootloader-kernel
interface.
The first two patches fix booting with devicetree; the last two
improve the kernel's tolerance of different bootloader implementations.
Please review.
Thanks
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
---
Jiaxun Yang (4):
LoongArch: Fix built-in DTB detection
LoongArch: smp: Add all CPUs enabled by fdt to NUMA node 0
LoongArch: Fix entry point in image header
LoongArch: Clear higher address bits in JUMP_VIRT_ADDR
arch/loongarch/include/asm/stackframe.h | 4 +++-
arch/loongarch/kernel/head.S | 2 +-
arch/loongarch/kernel/setup.c | 6 ++++--
arch/loongarch/kernel/smp.c | 5 ++++-
4 files changed, 12 insertions(+), 5 deletions(-)
---
base-commit: 124cfbcd6d185d4f50be02d5f5afe61578916773
change-id: 20240521-loongarch-booting-fixes-366e13e7ca55
Best regards,
--
Jiaxun Yang <jiaxun.yang(a)flygoat.com>
From: Jorge Ramirez-Ortiz <jorge(a)foundries.io>
commit 67380251e8bbd3302c64fea07f95c31971b91c22 upstream
Requesting a retune before switching to the RPMB partition has been
observed to cause CRC errors on the RPMB reads (-EILSEQ).
Since RPMB reads can not be retried, the clients would be directly
affected by the errors.
This commit disables the retune request prior to switching to the RPMB
partition: mmc_retune_pause() no longer triggers a retune before the
pause period begins.
This was verified with the sdhci-of-arasan driver (ZynqMP) configured
for HS200 using two separate eMMC cards (DG4064 and 064GB2). In both
cases, the error was easy to reproduce, triggering every few tens of
reads.
With this commit, systems that were utilizing OP-TEE to access RPMB
variables will experience an enhanced performance. Specifically, when
OP-TEE is configured to employ RPMB as a secure storage solution, it not
only writes the data but also the secure filesystem within the
partition. As a result, retrieving any variable involves multiple RPMB
reads, typically around five.
For context, on ZynqMP, each retune request consumed approximately
8ms. Consequently, reading any RPMB variable used to take at the very
minimum 40ms.
After dropping the need to retune before switching to the RPMB partition,
this is no longer the case.
Signed-off-by: Jorge Ramirez-Ortiz <jorge(a)foundries.io>
Acked-by: Avri Altman <avri.altman(a)wdc.com>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Link: https://lore.kernel.org/r/20240103112911.2954632-1-jorge@foundries.io
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
Signed-off-by: Florian Fainelli <florian.fainelli(a)broadcom.com>
---
drivers/mmc/core/host.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 3e94401c0eb3..23d95d2bdf05 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -68,13 +68,12 @@ void mmc_retune_enable(struct mmc_host *host)
/*
* Pause re-tuning for a small set of operations. The pause begins after the
- * next command and after first doing re-tuning.
+ * next command.
*/
void mmc_retune_pause(struct mmc_host *host)
{
if (!host->retune_paused) {
host->retune_paused = 1;
- mmc_retune_needed(host);
mmc_retune_hold(host);
}
}
--
2.34.1
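For readers following the mmc_retune_pause() change above, here is a minimal
userspace sketch of the before/after behaviour. It is not the kernel code:
struct host_model, model_retune_pause() and retune_overhead_ms() are
hypothetical names, and the figures (about 8 ms per retune, about five RPMB
reads per variable, one retune per read) are taken from the commit message
rather than measured here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the struct mmc_host fields touched here. */
struct host_model {
	bool retune_paused;
	bool need_retune;	/* set when a retune has been requested */
	int hold_count;		/* stands in for the mmc_retune_hold() effect */
};

/*
 * Model of the pause path. request_retune = true mimics the old code,
 * which asked for a retune before pausing; false mimics the patched code.
 */
static void model_retune_pause(struct host_model *h, bool request_retune)
{
	assert(h != NULL);
	if (!h->retune_paused) {
		h->retune_paused = true;
		if (request_retune)
			h->need_retune = true;
		h->hold_count++;
	}
}

/* Retune time added to one variable lookup, using the commit's estimate. */
static unsigned int retune_overhead_ms(unsigned int rpmb_reads,
				       unsigned int retune_ms,
				       bool request_retune)
{
	return request_retune ? rpmb_reads * retune_ms : 0;
}
```

With the commit's numbers, retune_overhead_ms(5, 8, true) gives the 40 ms
floor described above, and the patched path adds none of it.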
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
When the inode is being dropped from the dentry, the TRACEFS_EVENT_INODE
flag needs to be cleared to prevent a remount from calling
eventfs_remount() on the tracefs_inode private data. There's a race
between when the inode is dropped (and the dentry freed) and when the
inode is actually freed. If a remount happens between the two, the
eventfs_inode could be accessed after it is freed (only the dentry keeps
a ref count on it).
Currently the TRACEFS_EVENT_INODE flag is cleared from the dentry iput()
function. But this is incorrect, as it is possible that the inode has
another reference to it. The flag should only be cleared when the inode is
really being dropped and has no more references. That happens in the
drop_inode callback of the inode, as that gets called when the last
reference of the inode is released.
Remove the tracefs_d_iput() function and move its logic to the more
appropriate tracefs_drop_inode() callback function.
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.908205106@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masahiro Yamada <masahiroy(a)kernel.org>
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/inode.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9252e0d78ea2..7c29f4afc23d 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -426,10 +426,26 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
return 0;
}
+static int tracefs_drop_inode(struct inode *inode)
+{
+ struct tracefs_inode *ti = get_tracefs(inode);
+
+ /*
+ * This inode is being freed and cannot be used for
+ * eventfs. Clear the flag so that it doesn't call into
+ * eventfs during the remount flag updates. The eventfs_inode
+ * gets freed after an RCU cycle, so the content will still
+ * be safe if the iteration is going on now.
+ */
+ ti->flags &= ~TRACEFS_EVENT_INODE;
+
+ return 1;
+}
+
static const struct super_operations tracefs_super_operations = {
.alloc_inode = tracefs_alloc_inode,
.free_inode = tracefs_free_inode,
- .drop_inode = generic_delete_inode,
+ .drop_inode = tracefs_drop_inode,
.statfs = simple_statfs,
.show_options = tracefs_show_options,
};
@@ -455,22 +471,7 @@ static int tracefs_d_revalidate(struct dentry *dentry, unsigned int flags)
return !(ei && ei->is_freed);
}
-static void tracefs_d_iput(struct dentry *dentry, struct inode *inode)
-{
- struct tracefs_inode *ti = get_tracefs(inode);
-
- /*
- * This inode is being freed and cannot be used for
- * eventfs. Clear the flag so that it doesn't call into
- * eventfs during the remount flag updates. The eventfs_inode
- * gets freed after an RCU cycle, so the content will still
- * be safe if the iteration is going on now.
- */
- ti->flags &= ~TRACEFS_EVENT_INODE;
-}
-
static const struct dentry_operations tracefs_dentry_operations = {
- .d_iput = tracefs_d_iput,
.d_revalidate = tracefs_d_revalidate,
.d_release = tracefs_d_release,
};
--
2.43.0
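The difference between clearing the flag in d_iput and in drop_inode, as
described in the patch above, can be illustrated with a small userspace
model. This is only a sketch: struct model_inode, MODEL_EVENT_INODE,
model_drop_inode() and model_iput() are hypothetical names, and the real VFS
reference counting is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EVENT_INODE 0x1u	/* stands in for TRACEFS_EVENT_INODE */

struct model_inode {
	unsigned int refcount;
	unsigned int flags;
};

/* Models the drop callback: it runs only for the last reference. */
static int model_drop_inode(struct model_inode *inode)
{
	inode->flags &= ~MODEL_EVENT_INODE;
	return 1;
}

/* Release one reference; returns 1 if it was the last one, 0 otherwise. */
static int model_iput(struct model_inode *inode)
{
	assert(inode != NULL && inode->refcount > 0);
	if (--inode->refcount > 0)
		return 0;
	return model_drop_inode(inode);
}
```

In this model an inode with two references keeps the flag after the first
model_iput() and loses it only on the second, which is the property the
commit message asks for.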
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The change to update the permissions of the eventfs_inode had the
misconception that using the tracefs_inode would find all the
eventfs_inodes that have been updated and reset them on remount.
The problem with this approach is that the eventfs_inodes are freed when
they are no longer used (basically the reason the eventfs system exists).
When they are freed, the updated eventfs_inodes are not reset on a remount
because their tracefs_inodes have been freed.
Instead, since the events directory eventfs_inode always has a
tracefs_inode pointing to it (it is not freed when finished), and the
events directory has a link to all its children, have the
eventfs_remount() function only operate on the events eventfs_inode and
have it descend into its children updating their uid and gids.
Link: https://lore.kernel.org/all/CAK7LNARXgaWw3kH9JgrnH4vK6fr8LDkNKf3wq8NhMWJrVw…
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.754424703@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Reported-by: Masahiro Yamada <masahiroy(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/event_inode.c | 44 ++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 13 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 5dfb1ccd56ea..129d0f54ba62 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -305,27 +305,27 @@ static const struct file_operations eventfs_file_operations = {
.llseek = generic_file_llseek,
};
-/*
- * On a remount of tracefs, if UID or GID options are set, then
- * the mount point inode permissions should be used.
- * Reset the saved permission flags appropriately.
- */
-void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+static void eventfs_set_attrs(struct eventfs_inode *ei, bool update_uid, kuid_t uid,
+ bool update_gid, kgid_t gid, int level)
{
- struct eventfs_inode *ei = ti->private;
+ struct eventfs_inode *ei_child;
- if (!ei)
+ /* Update events/<system>/<event> */
+ if (WARN_ON_ONCE(level > 3))
return;
if (update_uid) {
ei->attr.mode &= ~EVENTFS_SAVE_UID;
- ei->attr.uid = ti->vfs_inode.i_uid;
+ ei->attr.uid = uid;
}
-
if (update_gid) {
ei->attr.mode &= ~EVENTFS_SAVE_GID;
- ei->attr.gid = ti->vfs_inode.i_gid;
+ ei->attr.gid = gid;
+ }
+
+ list_for_each_entry(ei_child, &ei->children, list) {
+ eventfs_set_attrs(ei_child, update_uid, uid, update_gid, gid, level + 1);
}
if (!ei->entry_attrs)
@@ -334,13 +334,31 @@ void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
for (int i = 0; i < ei->nr_entries; i++) {
if (update_uid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_UID;
- ei->entry_attrs[i].uid = ti->vfs_inode.i_uid;
+ ei->entry_attrs[i].uid = uid;
}
if (update_gid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_GID;
- ei->entry_attrs[i].gid = ti->vfs_inode.i_gid;
+ ei->entry_attrs[i].gid = gid;
}
}
+
+}
+
+/*
+ * On a remount of tracefs, if UID or GID options are set, then
+ * the mount point inode permissions should be used.
+ * Reset the saved permission flags appropriately.
+ */
+void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+{
+ struct eventfs_inode *ei = ti->private;
+
+ /* Only the events directory does the updates */
+ if (!ei || !ei->is_events || ei->is_freed)
+ return;
+
+ eventfs_set_attrs(ei, update_uid, ti->vfs_inode.i_uid,
+ update_gid, ti->vfs_inode.i_gid, 0);
}
/* Return the evenfs_inode of the "events" directory */
--
2.43.0
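The descent performed by eventfs_set_attrs() in the patch above can be
sketched in userspace as a depth-limited tree walk. The names below
(struct model_ei, model_set_attrs(), the MODEL_* constants) are hypothetical,
plain integers replace kuid_t/kgid_t, a sibling pointer replaces the kernel
list, and the per-entry entry_attrs[] update is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SAVE_UID	0x1u	/* stands in for EVENTFS_SAVE_UID */
#define MODEL_SAVE_GID	0x2u	/* stands in for EVENTFS_SAVE_GID */
#define MODEL_MAX_LEVEL	3	/* events/<system>/<event> */

struct model_ei {
	unsigned int mode;
	unsigned int uid, gid;
	struct model_ei *first_child;
	struct model_ei *next_sibling;
};

/* Reset the saved uid/gid on this node, then recurse into its children. */
static void model_set_attrs(struct model_ei *ei, bool update_uid,
			    unsigned int uid, bool update_gid,
			    unsigned int gid, int level)
{
	assert(ei != NULL);

	/* Refuse to walk deeper than the expected directory depth. */
	if (level > MODEL_MAX_LEVEL)
		return;

	if (update_uid) {
		ei->mode &= ~MODEL_SAVE_UID;
		ei->uid = uid;
	}
	if (update_gid) {
		ei->mode &= ~MODEL_SAVE_GID;
		ei->gid = gid;
	}

	for (struct model_ei *c = ei->first_child; c; c = c->next_sibling)
		model_set_attrs(c, update_uid, uid, update_gid, gid, level + 1);
}
```

Starting the walk at the top-level directory with level 0 reaches every
descendant through the child links, so no per-child tracefs_inode is needed.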
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The directories require unique inode numbers but all the eventfs files
have the same inode number. Prevent the directories from having the same
inode numbers as the files as that can confuse some tooling.
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.428826685@goodmis…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Masahiro Yamada <masahiroy(a)kernel.org>
Fixes: 834bf76add3e6 ("eventfs: Save directory inodes in the eventfs_inode structure")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/event_inode.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 0256afdd4acf..55a40a730b10 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -50,8 +50,12 @@ static struct eventfs_root_inode *get_root_inode(struct eventfs_inode *ei)
/* Just try to make something consistent and unique */
static int eventfs_dir_ino(struct eventfs_inode *ei)
{
- if (!ei->ino)
+ if (!ei->ino) {
ei->ino = get_next_ino();
+ /* Must not have the file inode number */
+ if (ei->ino == EVENTFS_FILE_INODE_INO)
+ ei->ino = get_next_ino();
+ }
return ei->ino;
}
--
2.43.0
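The inode-number check in the patch above amounts to "take the next number,
and take one more if it collides with the reserved file number". A userspace
sketch follows, with the caveat that MODEL_FILE_INO is an arbitrary
illustrative value (not the kernel's EVENTFS_FILE_INODE_INO) and
model_next_ino() is a plain counter standing in for get_next_ino().

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FILE_INO 1000u	/* hypothetical reserved file inode number */

static unsigned int model_counter;

/* Plain counter standing in for the kernel's get_next_ino(). */
static unsigned int model_next_ino(void)
{
	return ++model_counter;
}

struct model_dir {
	unsigned int ino;	/* 0 means not assigned yet */
};

/* Assign a directory inode number once, skipping the reserved value. */
static unsigned int model_dir_ino(struct model_dir *d)
{
	assert(d != NULL);
	if (!d->ino) {
		d->ino = model_next_ino();
		/* Must not share the number used by all files. */
		if (d->ino == MODEL_FILE_INO)
			d->ino = model_next_ino();
	}
	return d->ino;
}
```

A single retry is enough in this model because consecutive values from the
counter differ, so the second value cannot equal the reserved one.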
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: a6c11c0a5235fb144a65e0cb2ffd360ddc1f6c32
Gitweb: https://git.kernel.org/tip/a6c11c0a5235fb144a65e0cb2ffd360ddc1f6c32
Author: Dongli Zhang <dongli.zhang(a)oracle.com>
AuthorDate: Wed, 22 May 2024 15:02:18 -07:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Thu, 23 May 2024 21:51:50 +02:00
genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
interrupt affinity reconfiguration via procfs. Instead, the change is
deferred until the next instance of the interrupt being triggered on the
original CPU.
When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but the old vector on the original CPU remains and is not immediately
reclaimed. Instead, apicd->move_in_progress is flagged, and the reclaiming
process is delayed until the next trigger of the interrupt on the new CPU.
Upon the subsequent triggering of the interrupt on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
remains online. Subsequently, the timer on the old CPU iterates over its
vector_cleanup list, reclaiming old vectors.
However, a rare scenario arises if the old CPU is outgoing before the
interrupt triggers again on the new CPU.
In that case irq_force_complete_move() is not invoked on the outgoing CPU
to reclaim the old apicd->prev_vector because the interrupt isn't currently
affine to the outgoing CPU, and irq_needs_fixup() returns false. Even
though __vector_schedule_cleanup() is later called on the new CPU, it
doesn't reclaim apicd->prev_vector; instead, it simply resets both
apicd->move_in_progress and apicd->prev_vector to 0.
As a result, the vector remains unreclaimed in vector_matrix, leading to a
CPU vector leak.
To address this issue, move the invocation of irq_force_complete_move()
before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
interrupt is currently or used to be affine to the outgoing CPU.
Additionally, reclaim the vector in __vector_schedule_cleanup() as well,
following a warning message, although theoretically it should never see
apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.
Fixes: f0383c24b485 ("genirq/cpuhotplug: Add support for cleaning up move in progress")
Signed-off-by: Dongli Zhang <dongli.zhang(a)oracle.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20240522220218.162423-1-dongli.zhang@oracle.com
---
arch/x86/kernel/apic/vector.c | 9 ++++++---
kernel/irq/cpuhotplug.c | 16 ++++++++--------
2 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 9eec529..5573181 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -1035,7 +1035,8 @@ static void __vector_schedule_cleanup(struct apic_chip_data *apicd)
add_timer_on(&cl->timer, cpu);
}
} else {
- apicd->prev_vector = 0;
+ pr_warn("IRQ %u schedule cleanup for offline CPU %u\n", apicd->irq, cpu);
+ free_moved_vector(apicd);
}
raw_spin_unlock(&vector_lock);
}
@@ -1072,6 +1073,7 @@ void irq_complete_move(struct irq_cfg *cfg)
*/
void irq_force_complete_move(struct irq_desc *desc)
{
+ unsigned int cpu = smp_processor_id();
struct apic_chip_data *apicd;
struct irq_data *irqd;
unsigned int vector;
@@ -1096,10 +1098,11 @@ void irq_force_complete_move(struct irq_desc *desc)
goto unlock;
/*
- * If prev_vector is empty, no action required.
+ * If prev_vector is empty or the descriptor is neither currently
+ * nor previously on the outgoing CPU no action required.
*/
vector = apicd->prev_vector;
- if (!vector)
+ if (!vector || (apicd->cpu != cpu && apicd->prev_cpu != cpu))
goto unlock;
/*
diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index 75cadbc..eb86283 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -70,6 +70,14 @@ static bool migrate_one_irq(struct irq_desc *desc)
}
/*
+ * Complete an eventually pending irq move cleanup. If this
+ * interrupt was moved in hard irq context, then the vectors need
+ * to be cleaned up. It can't wait until this interrupt actually
+ * happens and this CPU was involved.
+ */
+ irq_force_complete_move(desc);
+
+ /*
* No move required, if:
* - Interrupt is per cpu
* - Interrupt is not started
@@ -88,14 +96,6 @@ static bool migrate_one_irq(struct irq_desc *desc)
}
/*
- * Complete an eventually pending irq move cleanup. If this
- * interrupt was moved in hard irq context, then the vectors need
- * to be cleaned up. It can't wait until this interrupt actually
- * happens and this CPU was involved.
- */
- irq_force_complete_move(desc);
-
- /*
* If there is a setaffinity pending, then try to reuse the pending
* mask, so the last change of the affinity does not get lost. If
* there is no move pending or the pending mask does not contain
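The early-exit condition added to irq_force_complete_move() in the vector.c
hunk above can be read as a predicate: a forced cleanup is only relevant when
a previous vector exists and the outgoing CPU is either the current or the
previous target. The sketch below restates that in userspace; struct
model_apicd and model_must_reclaim() are hypothetical names, and the locking
and the actual vector freeing are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the apic_chip_data fields used by the check. */
struct model_apicd {
	unsigned int cpu;		/* current target CPU */
	unsigned int prev_cpu;		/* CPU the interrupt moved away from */
	unsigned int prev_vector;	/* 0 means no move is pending */
};

/*
 * Logical negation of the patched early-exit test:
 *   !vector || (apicd->cpu != cpu && apicd->prev_cpu != cpu)
 */
static bool model_must_reclaim(const struct model_apicd *d,
			       unsigned int outgoing_cpu)
{
	assert(d != NULL);
	if (!d->prev_vector)
		return false;
	return d->cpu == outgoing_cpu || d->prev_cpu == outgoing_cpu;
}
```

The leak scenario in the commit message corresponds to prev_cpu being the
outgoing CPU while cpu already points at the new one; the predicate is true
there, so the old vector is reclaimed before the CPU goes away.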
The patch titled
Subject: selftest: mm: Test if hugepage does not get leaked during __bio_release_pages()
has been added to the -mm mm-unstable branch. Its filename is
selftest-mm-test-if-hugepage-does-not-get-leaked-during-__bio_release_pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Donet Tom <donettom(a)linux.ibm.com>
Subject: selftest: mm: Test if hugepage does not get leaked during __bio_release_pages()
Date: Thu, 23 May 2024 01:39:05 -0500
Commit 1b151e2435fc ("block: Remove special-casing of compound pages")
caused a change in behaviour when releasing the pages if the buffer does
not start at the beginning of the page. This was because the calculation
of the number of pages to release was incorrect. This was fixed by commit
38b43539d64b ("block: Fix page refcounts for unaligned buffers in
__bio_release_pages()").
We pin the user buffer during direct I/O writes. If this buffer is a
hugepage, bio_release_page() will unpin it and decrement all references
and pin counts at ->bi_end_io. However, if any references to the hugepage
remain post-I/O, the hugepage will not be freed upon unmap, leading to a
memory leak.
This patch verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping, regardless of whether the
offsets are aligned or unaligned w.r.t page boundary.
Test Result Fail Scenario (Without the fix)
--------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 6
not ok 4 : Huge pages not freed!
Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
Test Result PASS Scenario (With the fix)
---------------------------------------------------------
[]# ./hugetlb_dio
TAP version 13
1..4
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 1 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 2 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 3 : Huge pages freed successfully !
No. Free pages before allocation : 7
No. Free pages after munmap : 7
ok 4 : Huge pages freed successfully !
Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
Link: https://lkml.kernel.org/r/20240523063905.3173-1-donettom@linux.ibm.com
Fixes: 38b43539d64b ("block: Fix page refcounts for unaligned buffers in __bio_release_pages()")
Signed-off-by: Donet Tom <donettom(a)linux.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: Mike Rapoport (IBM) <rppt(a)kernel.org>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Tony Battersby <tonyb(a)cybernetics.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/Makefile | 1
tools/testing/selftests/mm/hugetlb_dio.c | 118 +++++++++++++++++++++
2 files changed, 119 insertions(+)
--- /dev/null
+++ a/tools/testing/selftests/mm/hugetlb_dio.c
@@ -0,0 +1,118 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This program tests for hugepage leaks after DIO writes to a file using a
+ * hugepage as the user buffer. During DIO, the user buffer is pinned and
+ * should be properly unpinned upon completion. This patch verifies that the
+ * kernel correctly unpins the buffer at DIO completion for both aligned and
+ * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
+ * is freed upon unmapping.
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <string.h>
+#include <sys/mman.h>
+#include "vm_util.h"
+#include "../kselftest.h"
+
+void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
+{
+ int fd;
+ char *buffer = NULL;
+ char *orig_buffer = NULL;
+ size_t h_pagesize = 0;
+ size_t writesize;
+ int free_hpage_b = 0;
+ int free_hpage_a = 0;
+
+ writesize = end_off - start_off;
+
+ /* Get the default huge page size */
+ h_pagesize = default_huge_page_size();
+ if (!h_pagesize)
+ ksft_exit_fail_msg("Unable to determine huge page size\n");
+
+ /* Open the file to DIO */
+ fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT);
+ if (fd < 0)
+ ksft_exit_fail_msg("Error opening file");
+
+ /* Get the free huge pages before allocation */
+ free_hpage_b = get_free_hugepages();
+ if (free_hpage_b == 0) {
+ close(fd);
+ ksft_exit_skip("No free hugepage, exiting!\n");
+ }
+
+ /* Allocate a hugetlb page */
+ orig_buffer = mmap(NULL, h_pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE
+ | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
+ if (orig_buffer == MAP_FAILED) {
+ close(fd);
+ ksft_exit_fail_msg("Error mapping memory");
+ }
+ buffer = orig_buffer;
+ buffer += start_off;
+
+ memset(buffer, 'A', writesize);
+
+ /* Write the buffer to the file */
+ if (write(fd, buffer, writesize) != (writesize)) {
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+ ksft_exit_fail_msg("Error writing to file");
+ }
+
+ /* unmap the huge page */
+ munmap(orig_buffer, h_pagesize);
+ close(fd);
+
+ /* Get the free huge pages after unmap*/
+ free_hpage_a = get_free_hugepages();
+
+ /*
+ * If the no. of free hugepages before allocation and after unmap does
+ * not match - that means there could still be a page which is pinned.
+ */
+ if (free_hpage_a != free_hpage_b) {
+ printf("No. Free pages before allocation : %d\n", free_hpage_b);
+ printf("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_fail(": Huge pages not freed!\n");
+ } else {
+ printf("No. Free pages before allocation : %d\n", free_hpage_b);
+ printf("No. Free pages after munmap : %d\n", free_hpage_a);
+ ksft_test_result_pass(": Huge pages freed successfully !\n");
+ }
+}
+
+int main(void)
+{
+ size_t pagesize = 0;
+
+ ksft_print_header();
+ ksft_set_plan(4);
+
+ /* Get base page size */
+ pagesize = psize();
+
+ /* start and end is aligned to pagesize */
+ run_dio_using_hugetlb(0, (pagesize * 3));
+
+ /* start is aligned but end is not aligned */
+ run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
+
+ /* start is unaligned and end is aligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
+
+ /* both start and end are unaligned */
+ run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
+
+ ksft_finished();
+ return 0;
+}
+
--- a/tools/testing/selftests/mm/Makefile~selftest-mm-test-if-hugepage-does-not-get-leaked-during-__bio_release_pages
+++ a/tools/testing/selftests/mm/Makefile
@@ -71,6 +71,7 @@ TEST_GEN_FILES += ksm_functional_tests
TEST_GEN_FILES += mdwe_test
TEST_GEN_FILES += hugetlb_fault_after_madv
TEST_GEN_FILES += hugetlb_madv_vs_map
+TEST_GEN_FILES += hugetlb_dio
ifneq ($(ARCH),arm64)
TEST_GEN_FILES += soft-dirty
_
Patches currently in -mm which might be from donettom(a)linux.ibm.com are
selftest-mm-test-if-hugepage-does-not-get-leaked-during-__bio_release_pages.patch
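The selftest above compares the number of free huge pages before the mmap()
and after the munmap(). It obtains that number through a helper,
get_free_hugepages(), whose implementation is not part of the patch. As a
rough illustration of where such a number can come from, the sketch below
extracts the HugePages_Free field from text in /proc/meminfo format.
parse_free_hugepages() is a hypothetical function written for this note and
is not claimed to match the selftest helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the HugePages_Free value found in meminfo-formatted text,
 * or -1 if the field is missing or malformed.
 */
static long parse_free_hugepages(const char *meminfo)
{
	const char *key = "HugePages_Free:";
	const char *line;
	long value;

	assert(meminfo != NULL);

	line = strstr(meminfo, key);
	if (!line)
		return -1;

	/* %ld skips the padding spaces between the key and the number. */
	if (sscanf(line + strlen(key), "%ld", &value) != 1)
		return -1;

	return value;
}
```

Feeding it the contents of /proc/meminfo before and after the DIO write and
comparing the two results would mirror the pass/fail decision in the test.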
The patch titled
Subject: mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
Date: Thu, 23 May 2024 15:12:17 +0800
While I was running memory failure tests recently, the panic below occurred:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:1009!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:__del_page_from_free_list+0x151/0x180
RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
Call Trace:
<TASK>
__rmqueue_pcplist+0x23b/0x520
get_page_from_freelist+0x26b/0xe40
__alloc_pages_noprof+0x113/0x1120
__folio_alloc_noprof+0x11/0xb0
alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
__alloc_fresh_hugetlb_folio+0xe7/0x140
alloc_pool_huge_folio+0x68/0x100
set_max_huge_pages+0x13d/0x340
hugetlb_sysctl_handler_common+0xe8/0x110
proc_sys_call_handler+0x194/0x280
vfs_write+0x387/0x550
ksys_write+0x64/0xe0
do_syscall_64+0xc2/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff916114887
RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
</TASK>
Modules linked in: mce_inject hwpoison_inject
---[ end trace 0000000000000000 ]---
Before the panic, there was a warning about a bad page state:
BUG: Bad page state in process page-types pfn:8cee00
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
page_type: 0xffffff7f(buddy)
raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
Call Trace:
<TASK>
dump_stack_lvl+0x83/0xa0
bad_page+0x63/0xf0
free_unref_page+0x36e/0x5c0
unpoison_memory+0x50b/0x630
simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
debugfs_attr_write+0x42/0x60
full_proxy_write+0x5b/0x80
vfs_write+0xcd/0x550
ksys_write+0x64/0xe0
do_syscall_64+0xc2/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f189a514887
RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
</TASK>
The root cause appears to be the following race:
 memory_failure
  try_memory_failure_hugetlb
   me_huge_page
    __page_handle_poison
     dissolve_free_hugetlb_folio
     drain_all_pages -- Buddy page can be isolated e.g. for compaction.
     take_page_off_buddy -- Failed as page is not in the buddy list.
                         -- Page can be putback into buddy after compaction.
    page_ref_inc -- Leads to buddy page with refcnt = 1.
Then unpoison_memory() can unpoison the page and send the buddy page back
into the buddy list again, leading to the bad page state warning above.
bad_page() will then call page_mapcount_reset(), which removes PageBuddy
from the buddy page and leads to the later VM_BUG_ON_PAGE(!PageBuddy(page))
when trying to allocate this page.
Fix this issue by only treating __page_handle_poison() as successful when
it returns 1.
Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memory-failure.c~mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages
+++ a/mm/memory-failure.c
@@ -1221,7 +1221,7 @@ static int me_huge_page(struct page_stat
* subpages.
*/
folio_put(folio);
- if (__page_handle_poison(p) >= 0) {
+ if (__page_handle_poison(p) > 0) {
page_ref_inc(p);
res = MF_RECOVERED;
} else {
@@ -2091,7 +2091,7 @@ retry:
*/
if (res == 0) {
folio_unlock(folio);
- if (__page_handle_poison(p) >= 0) {
+ if (__page_handle_poison(p) > 0) {
page_ref_inc(p);
res = MF_RECOVERED;
} else {
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-huge_memory-dont-unpoison-huge_zero_folio.patch
mm-memory-failure-fix-handling-of-dissolved-but-not-taken-off-from-buddy-pages.patch
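The one-character change in the memory-failure patch above (">= 0" to "> 0")
is easier to see with the return values spelled out. The sketch below assumes
the following convention, inferred from the commit message and the race
diagram rather than shown in the diff: 1 means the page was dissolved and
taken off the buddy list, 0 means it was dissolved but not taken off, and a
negative value means the dissolve failed. model_handle_poison_result() and
enum model_mf_result are hypothetical names, and treating the else branch as
a plain failure is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>

enum model_mf_result { MODEL_MF_FAILED, MODEL_MF_RECOVERED };

/*
 * Decide the outcome from the (assumed) __page_handle_poison() result.
 * Only ret > 0 counts as success, so a page that is still owned by the
 * buddy allocator never gets its reference count raised.
 */
static enum model_mf_result model_handle_poison_result(int ret, int *refcount)
{
	assert(refcount != NULL);
	if (ret > 0) {
		(*refcount)++;	/* models page_ref_inc() */
		return MODEL_MF_RECOVERED;
	}
	return MODEL_MF_FAILED;
}
```

Under the old ">= 0" test, a return of 0 would have taken the first branch
and produced exactly the "buddy page with refcnt = 1" state from the race
description.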
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 42316941335644a98335f209daafa4c122f28983
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052314-demeanor-mushy-46bb@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 42316941335644a98335f209daafa4c122f28983 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Sun, 21 Apr 2024 17:37:49 +0000
Subject: [PATCH] binder: fix max_thread type inconsistency
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.
Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve(a)android.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Reviewed-by: Alice Ryhl <aliceryhl(a)google.com>
Link: https://lore.kernel.org/r/20240421173750.3117808-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index dd6923d37931..b21a7b246a0d 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -5367,7 +5367,7 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto err;
break;
case BINDER_SET_MAX_THREADS: {
- int max_threads;
+ u32 max_threads;
if (copy_from_user(&max_threads, ubuf,
sizeof(max_threads))) {
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 7270d4d22207..5b7c80b99ae8 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -421,7 +421,7 @@ struct binder_proc {
struct list_head todo;
struct binder_stats stats;
struct list_head delivered_death;
- int max_threads;
+ u32 max_threads;
int requested_threads;
int requested_threads_started;
int tmp_ref;
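To illustrate why the signedness mismatch fixed by the binder patch above
matters, here is a small userspace sketch. It assumes a 32-bit two's
complement int, which holds on the platforms Linux supports, and it uses
memcpy() as a stand-in for copy_from_user(). read_as_int() and read_as_u32()
are hypothetical helpers, not binder code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy four bytes into a signed int, as the old code effectively did. */
static int read_as_int(const void *ubuf)
{
	int v;

	assert(ubuf != NULL);
	memcpy(&v, ubuf, sizeof(v));
	return v;
}

/* Copy the same four bytes into a u32, matching the ioctl's __u32 type. */
static uint32_t read_as_u32(const void *ubuf)
{
	uint32_t v;

	assert(ubuf != NULL);
	memcpy(&v, ubuf, sizeof(v));
	return v;
}
```

A caller passing 0x80000000 would see it stored as a negative number through
read_as_int() but as the intended large value through read_as_u32(), so any
later comparison against a thread count behaves differently between the two.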
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 42316941335644a98335f209daafa4c122f28983
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052314-pardon-confider-6160@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 42316941335644a98335f209daafa4c122f28983 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Sun, 21 Apr 2024 17:37:49 +0000
Subject: [PATCH] binder: fix max_thread type inconsistency
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The type defined for the BINDER_SET_MAX_THREADS ioctl was changed from
size_t to __u32 in order to avoid incompatibility issues between 32 and
64-bit kernels. However, the internal types used to copy from user and
store the value were never updated. Use u32 to fix the inconsistency.
Fixes: a9350fc859ae ("staging: android: binder: fix BINDER_SET_MAX_THREADS declaration")
Reported-by: Arve Hjønnevåg <arve(a)android.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Reviewed-by: Alice Ryhl <aliceryhl(a)google.com>
Link: https://lore.kernel.org/r/20240421173750.3117808-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index dd6923d37931..b21a7b246a0d 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -5367,7 +5367,7 @@ static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto err;
break;
case BINDER_SET_MAX_THREADS: {
- int max_threads;
+ u32 max_threads;
if (copy_from_user(&max_threads, ubuf,
sizeof(max_threads))) {
diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
index 7270d4d22207..5b7c80b99ae8 100644
--- a/drivers/android/binder_internal.h
+++ b/drivers/android/binder_internal.h
@@ -421,7 +421,7 @@ struct binder_proc {
struct list_head todo;
struct binder_stats stats;
struct list_head delivered_death;
- int max_threads;
+ u32 max_threads;
int requested_threads;
int requested_threads_started;
int tmp_ref;
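As a rough userspace illustration of the binder type inconsistency described above: the same four bytes read back differently depending on whether the destination is signed or unsigned. This is a sketch only. memcpy() stands in for copy_from_user(), and the helper names are hypothetical, not part of the binder driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the old code path: the 32-bit value written
 * by userspace is copied into a signed 32-bit variable.
 */
static int32_t read_max_threads_as_int(uint32_t user_value)
{
	int32_t max_threads;

	memcpy(&max_threads, &user_value, sizeof(max_threads));
	return max_threads;
}

/*
 * Hypothetical stand-in for the fixed code path: the destination type
 * matches the unsigned 32-bit type declared for the ioctl.
 */
static uint32_t read_max_threads_as_u32(uint32_t user_value)
{
	uint32_t max_threads;

	memcpy(&max_threads, &user_value, sizeof(max_threads));
	return max_threads;
}
```

Small values survive either way; a value with the top bit set becomes negative in the signed variant, while the unsigned variant keeps it as written.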
The dynamically created mei client device (mei csi) is used as one V4L2
sub device of the whole video pipeline, and the V4L2 connection graph is
built by software node. mei_stop() and mei_restart() delete the old mei
csi client device and create a new mei client device, so the software
node information saved in the old mei csi device is lost and the whole
video pipeline is broken.
Removing mei_stop()/mei_restart() from system suspend/resume fixes the
issue above and does not affect the hardware's actual power saving logic.
Fixes: f6085a96c973 ("mei: vsc: Unregister interrupt handler for system suspend")
Cc: stable(a)vger.kernel.org # for 6.8+
Reported-by: Hao Yao <hao.yao(a)intel.com>
Signed-off-by: Wentong Wu <wentong.wu(a)intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Tested-by: Jason Chen <jason.z.chen(a)intel.com>
Tested-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
---
drivers/misc/mei/platform-vsc.c | 39 +++++++++++++--------------------
1 file changed, 15 insertions(+), 24 deletions(-)
diff --git a/drivers/misc/mei/platform-vsc.c b/drivers/misc/mei/platform-vsc.c
index b543e6b9f3cf..1ec65d87488a 100644
--- a/drivers/misc/mei/platform-vsc.c
+++ b/drivers/misc/mei/platform-vsc.c
@@ -399,41 +399,32 @@ static void mei_vsc_remove(struct platform_device *pdev)
static int mei_vsc_suspend(struct device *dev)
{
- struct mei_device *mei_dev = dev_get_drvdata(dev);
- struct mei_vsc_hw *hw = mei_dev_to_vsc_hw(mei_dev);
+ struct mei_device *mei_dev;
+ int ret = 0;
- mei_stop(mei_dev);
+ mei_dev = dev_get_drvdata(dev);
+ if (!mei_dev)
+ return -ENODEV;
- mei_disable_interrupts(mei_dev);
+ mutex_lock(&mei_dev->device_lock);
- vsc_tp_free_irq(hw->tp);
+ if (!mei_write_is_idle(mei_dev))
+ ret = -EAGAIN;
- return 0;
+ mutex_unlock(&mei_dev->device_lock);
+
+ return ret;
}
static int mei_vsc_resume(struct device *dev)
{
- struct mei_device *mei_dev = dev_get_drvdata(dev);
- struct mei_vsc_hw *hw = mei_dev_to_vsc_hw(mei_dev);
- int ret;
-
- ret = vsc_tp_request_irq(hw->tp);
- if (ret)
- return ret;
-
- ret = mei_restart(mei_dev);
- if (ret)
- goto err_free;
+ struct mei_device *mei_dev;
- /* start timer if stopped in suspend */
- schedule_delayed_work(&mei_dev->timer_work, HZ);
+ mei_dev = dev_get_drvdata(dev);
+ if (!mei_dev)
+ return -ENODEV;
return 0;
-
-err_free:
- vsc_tp_free_irq(hw->tp);
-
- return ret;
}
static DEFINE_SIMPLE_DEV_PM_OPS(mei_vsc_pm_ops, mei_vsc_suspend, mei_vsc_resume);
--
2.34.1
In some scenarios, the DPT object gets shrunk but the actual framebuffer
does not, so it is still on the DPT's vm->bound_list. A later attempt to
rewrite the PTEs then goes through a stale CPU mapping, which causes a
panic.
Credits-to: Ville Syrjala <ville.syrjala(a)linux.intel.com>
Shawn Lee <shawn.c.lee(a)intel.com>
Cc: stable(a)vger.kernel.org
Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt")
Signed-off-by: Vidya Srinivas <vidya.srinivas(a)intel.com>
---
drivers/gpu/drm/i915/gem/i915_gem_object.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3560a062d287..e6b485fc54d4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -284,7 +284,8 @@ bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj);
static inline bool
i915_gem_object_is_shrinkable(const struct drm_i915_gem_object *obj)
{
- return i915_gem_object_type_has(obj, I915_GEM_OBJECT_IS_SHRINKABLE);
+ return i915_gem_object_type_has(obj, I915_GEM_OBJECT_IS_SHRINKABLE) &&
+ !obj->is_dpt;
}
static inline bool
--
2.34.1
The dynamically created mei client device (mei csi) is used as one V4L2
sub device of the whole video pipeline, and the V4L2 connection graph is
built by software node. mei_stop() and mei_restart() delete the old mei
csi client device and create a new mei client device, so the software
node information saved in the old mei csi device is lost and the whole
video pipeline is broken.
Removing mei_stop()/mei_restart() from system suspend/resume fixes the
issue above and does not affect the hardware's actual power saving logic.
Fixes: 386a766c4169 ("mei: Add MEI hardware support for IVSC device")
Cc: stable(a)vger.kernel.org # for 6.8+
Reported-by: Hao Yao <hao.yao(a)intel.com>
Signed-off-by: Wentong Wu <wentong.wu(a)intel.com>
Tested-by: Jason Chen <jason.z.chen(a)intel.com>
---
drivers/misc/mei/platform-vsc.c | 39 +++++++++++++--------------------
1 file changed, 15 insertions(+), 24 deletions(-)
diff --git a/drivers/misc/mei/platform-vsc.c b/drivers/misc/mei/platform-vsc.c
index b543e6b9f3cf..1ec65d87488a 100644
--- a/drivers/misc/mei/platform-vsc.c
+++ b/drivers/misc/mei/platform-vsc.c
@@ -399,41 +399,32 @@ static void mei_vsc_remove(struct platform_device *pdev)
static int mei_vsc_suspend(struct device *dev)
{
- struct mei_device *mei_dev = dev_get_drvdata(dev);
- struct mei_vsc_hw *hw = mei_dev_to_vsc_hw(mei_dev);
+ struct mei_device *mei_dev;
+ int ret = 0;
- mei_stop(mei_dev);
+ mei_dev = dev_get_drvdata(dev);
+ if (!mei_dev)
+ return -ENODEV;
- mei_disable_interrupts(mei_dev);
+ mutex_lock(&mei_dev->device_lock);
- vsc_tp_free_irq(hw->tp);
+ if (!mei_write_is_idle(mei_dev))
+ ret = -EAGAIN;
- return 0;
+ mutex_unlock(&mei_dev->device_lock);
+
+ return ret;
}
static int mei_vsc_resume(struct device *dev)
{
- struct mei_device *mei_dev = dev_get_drvdata(dev);
- struct mei_vsc_hw *hw = mei_dev_to_vsc_hw(mei_dev);
- int ret;
-
- ret = vsc_tp_request_irq(hw->tp);
- if (ret)
- return ret;
-
- ret = mei_restart(mei_dev);
- if (ret)
- goto err_free;
+ struct mei_device *mei_dev;
- /* start timer if stopped in suspend */
- schedule_delayed_work(&mei_dev->timer_work, HZ);
+ mei_dev = dev_get_drvdata(dev);
+ if (!mei_dev)
+ return -ENODEV;
return 0;
-
-err_free:
- vsc_tp_free_irq(hw->tp);
-
- return ret;
}
static DEFINE_SIMPLE_DEV_PM_OPS(mei_vsc_pm_ops, mei_vsc_suspend, mei_vsc_resume);
--
2.34.1
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y
git checkout FETCH_HEAD
git cherry-pick -x 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052357-thong-expensive-7cdd@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07 Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Date: Mon, 29 Apr 2024 15:35:58 +0200
Subject: [PATCH] usb: typec: tipd: fix event checking for tps6598x
The current interrupt service routine of the tps6598x only reads the
first 64 bits of the INT_EVENT1 and INT_EVENT2 registers, which means
that any event above that range will be ignored, leaving interrupts
unattended. Moreover, those events will not be cleared, and the device
will keep the interrupt enabled.
This issue has been observed while attempting to load patches, and the
'ReadyForPatch' field (bit 81) of INT_EVENT1 was set.
Given that older versions of the tps6598x (1, 2 and 6) provide 8-byte
registers, a mechanism based on the upper byte of the version register
(0x0F) has been included. The manufacturer has confirmed [1] that this
byte is always 0 for older versions, and either 0xF7 (DH parts) or 0xF9
(DK parts) is returned in newer versions (7 and 8).
Read the complete INT_EVENT registers to handle all interrupts generated
by the device and account for the hardware version to select the
register size.
Link: https://e2e.ti.com/support/power-management-group/power-management/f/power-… [1]
Fixes: 0a4c005bd171 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Cc: stable(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Link: https://lore.kernel.org/r/20240429-tps6598x_fix_event_handling-v3-2-4e8e58d…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tipd/core.c b/drivers/usb/typec/tipd/core.c
index 7c2f01344860..191f86da283d 100644
--- a/drivers/usb/typec/tipd/core.c
+++ b/drivers/usb/typec/tipd/core.c
@@ -28,6 +28,7 @@
#define TPS_REG_MODE 0x03
#define TPS_REG_CMD1 0x08
#define TPS_REG_DATA1 0x09
+#define TPS_REG_VERSION 0x0F
#define TPS_REG_INT_EVENT1 0x14
#define TPS_REG_INT_EVENT2 0x15
#define TPS_REG_INT_MASK1 0x16
@@ -636,49 +637,67 @@ static irqreturn_t tps25750_interrupt(int irq, void *data)
static irqreturn_t tps6598x_interrupt(int irq, void *data)
{
+ int intev_len = TPS_65981_2_6_INTEVENT_LEN;
struct tps6598x *tps = data;
- u64 event1 = 0;
- u64 event2 = 0;
+ u64 event1[2] = { };
+ u64 event2[2] = { };
+ u32 version;
u32 status;
int ret;
mutex_lock(&tps->lock);
- ret = tps6598x_read64(tps, TPS_REG_INT_EVENT1, &event1);
- ret |= tps6598x_read64(tps, TPS_REG_INT_EVENT2, &event2);
+ ret = tps6598x_read32(tps, TPS_REG_VERSION, &version);
+ if (ret)
+ dev_warn(tps->dev, "%s: failed to read version (%d)\n",
+ __func__, ret);
+
+ if (TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DH ||
+ TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DK)
+ intev_len = TPS_65987_8_INTEVENT_LEN;
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
if (ret) {
- dev_err(tps->dev, "%s: failed to read events\n", __func__);
+ dev_err(tps->dev, "%s: failed to read event1\n", __func__);
goto err_unlock;
}
- trace_tps6598x_irq(event1, event2);
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT2, event2, intev_len);
+ if (ret) {
+ dev_err(tps->dev, "%s: failed to read event2\n", __func__);
+ goto err_unlock;
+ }
+ trace_tps6598x_irq(event1[0], event2[0]);
- if (!(event1 | event2))
+ if (!(event1[0] | event1[1] | event2[0] | event2[1]))
goto err_unlock;
if (!tps6598x_read_status(tps, &status))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_POWER_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_POWER_STATUS_UPDATE)
if (!tps6598x_read_power_status(tps))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_DATA_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_DATA_STATUS_UPDATE)
if (!tps6598x_read_data_status(tps))
goto err_clear_ints;
/* Handle plug insert or removal */
- if ((event1 | event2) & TPS_REG_INT_PLUG_EVENT)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_PLUG_EVENT)
tps6598x_handle_plug_event(tps, status);
err_clear_ints:
- tps6598x_write64(tps, TPS_REG_INT_CLEAR1, event1);
- tps6598x_write64(tps, TPS_REG_INT_CLEAR2, event2);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR1, event1, intev_len);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR2, event2, intev_len);
err_unlock:
mutex_unlock(&tps->lock);
- if (event1 | event2)
+ if (event1[0] | event1[1] | event2[0] | event2[1])
return IRQ_HANDLED;
+
return IRQ_NONE;
}
diff --git a/drivers/usb/typec/tipd/tps6598x.h b/drivers/usb/typec/tipd/tps6598x.h
index 89b24519463a..9b23e9017452 100644
--- a/drivers/usb/typec/tipd/tps6598x.h
+++ b/drivers/usb/typec/tipd/tps6598x.h
@@ -253,4 +253,15 @@
#define TPS_PTCC_DEV 2
#define TPS_PTCC_APP 3
+/* Version Register */
+#define TPS_VERSION_HW_VERSION_MASK GENMASK(31, 24)
+#define TPS_VERSION_HW_VERSION(x) TPS_FIELD_GET(TPS_VERSION_HW_VERSION_MASK, (x))
+#define TPS_VERSION_HW_65981_2_6 0x00
+#define TPS_VERSION_HW_65987_8_DH 0xF7
+#define TPS_VERSION_HW_65987_8_DK 0xF9
+
+/* Int Event Register length */
+#define TPS_65981_2_6_INTEVENT_LEN 8
+#define TPS_65987_8_INTEVENT_LEN 11
+
#endif /* __TPS6598X_H__ */
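The version-based register sizing in the tps6598x patch above can be sketched in plain userspace C. The constants are the ones the patch defines; the helper names are hypothetical, and the byte layout (least-significant byte first, so bit N sits in byte N / 8) is an assumption made for the illustration, not something the commit states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the patch above. */
#define HW_VERSION_65987_8_DH	0xF7
#define HW_VERSION_65987_8_DK	0xF9
#define INTEVENT_LEN_65981_2_6	8
#define INTEVENT_LEN_65987_8	11

/* Upper byte (bits 31:24) of the 32-bit version register value. */
static unsigned int hw_version(uint32_t version)
{
	return (version >> 24) & 0xFFu;
}

/* Pick the INT_EVENT register length in bytes from the version value. */
static size_t intevent_len(uint32_t version)
{
	unsigned int hw = hw_version(version);

	if (hw == HW_VERSION_65987_8_DH || hw == HW_VERSION_65987_8_DK)
		return INTEVENT_LEN_65987_8;
	return INTEVENT_LEN_65981_2_6;
}

/*
 * Assumption: event bytes are stored least-significant byte first, so
 * bit N lives in byte N / 8. Bits beyond len are reported as clear.
 */
static int event_bit_is_set(const uint8_t *buf, size_t len, unsigned int bit)
{
	size_t byte = bit / 8;

	if (byte >= len)
		return 0;
	return (buf[byte] >> (bit % 8)) & 1;
}
```

Under that layout, bit 81 falls in byte 10, which an 8-byte read never reaches. That is consistent with the commit message's observation that 'ReadyForPatch' was set but left unhandled.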
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052354-lustrous-corrode-977e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07 Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Date: Mon, 29 Apr 2024 15:35:58 +0200
Subject: [PATCH] usb: typec: tipd: fix event checking for tps6598x
The current interrupt service routine of the tps6598x only reads the
first 64 bits of the INT_EVENT1 and INT_EVENT2 registers, which means
that any event above that range will be ignored, leaving interrupts
unattended. Moreover, those events will not be cleared, and the device
will keep the interrupt enabled.
This issue has been observed while attempting to load patches, and the
'ReadyForPatch' field (bit 81) of INT_EVENT1 was set.
Given that older versions of the tps6598x (1, 2 and 6) provide 8-byte
registers, a mechanism based on the upper byte of the version register
(0x0F) has been included. The manufacturer has confirmed [1] that this
byte is always 0 for older versions, and either 0xF7 (DH parts) or 0xF9
(DK parts) is returned in newer versions (7 and 8).
Read the complete INT_EVENT registers to handle all interrupts generated
by the device and account for the hardware version to select the
register size.
Link: https://e2e.ti.com/support/power-management-group/power-management/f/power-… [1]
Fixes: 0a4c005bd171 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Cc: stable(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Link: https://lore.kernel.org/r/20240429-tps6598x_fix_event_handling-v3-2-4e8e58d…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tipd/core.c b/drivers/usb/typec/tipd/core.c
index 7c2f01344860..191f86da283d 100644
--- a/drivers/usb/typec/tipd/core.c
+++ b/drivers/usb/typec/tipd/core.c
@@ -28,6 +28,7 @@
#define TPS_REG_MODE 0x03
#define TPS_REG_CMD1 0x08
#define TPS_REG_DATA1 0x09
+#define TPS_REG_VERSION 0x0F
#define TPS_REG_INT_EVENT1 0x14
#define TPS_REG_INT_EVENT2 0x15
#define TPS_REG_INT_MASK1 0x16
@@ -636,49 +637,67 @@ static irqreturn_t tps25750_interrupt(int irq, void *data)
static irqreturn_t tps6598x_interrupt(int irq, void *data)
{
+ int intev_len = TPS_65981_2_6_INTEVENT_LEN;
struct tps6598x *tps = data;
- u64 event1 = 0;
- u64 event2 = 0;
+ u64 event1[2] = { };
+ u64 event2[2] = { };
+ u32 version;
u32 status;
int ret;
mutex_lock(&tps->lock);
- ret = tps6598x_read64(tps, TPS_REG_INT_EVENT1, &event1);
- ret |= tps6598x_read64(tps, TPS_REG_INT_EVENT2, &event2);
+ ret = tps6598x_read32(tps, TPS_REG_VERSION, &version);
+ if (ret)
+ dev_warn(tps->dev, "%s: failed to read version (%d)\n",
+ __func__, ret);
+
+ if (TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DH ||
+ TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DK)
+ intev_len = TPS_65987_8_INTEVENT_LEN;
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
if (ret) {
- dev_err(tps->dev, "%s: failed to read events\n", __func__);
+ dev_err(tps->dev, "%s: failed to read event1\n", __func__);
goto err_unlock;
}
- trace_tps6598x_irq(event1, event2);
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT2, event2, intev_len);
+ if (ret) {
+ dev_err(tps->dev, "%s: failed to read event2\n", __func__);
+ goto err_unlock;
+ }
+ trace_tps6598x_irq(event1[0], event2[0]);
- if (!(event1 | event2))
+ if (!(event1[0] | event1[1] | event2[0] | event2[1]))
goto err_unlock;
if (!tps6598x_read_status(tps, &status))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_POWER_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_POWER_STATUS_UPDATE)
if (!tps6598x_read_power_status(tps))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_DATA_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_DATA_STATUS_UPDATE)
if (!tps6598x_read_data_status(tps))
goto err_clear_ints;
/* Handle plug insert or removal */
- if ((event1 | event2) & TPS_REG_INT_PLUG_EVENT)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_PLUG_EVENT)
tps6598x_handle_plug_event(tps, status);
err_clear_ints:
- tps6598x_write64(tps, TPS_REG_INT_CLEAR1, event1);
- tps6598x_write64(tps, TPS_REG_INT_CLEAR2, event2);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR1, event1, intev_len);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR2, event2, intev_len);
err_unlock:
mutex_unlock(&tps->lock);
- if (event1 | event2)
+ if (event1[0] | event1[1] | event2[0] | event2[1])
return IRQ_HANDLED;
+
return IRQ_NONE;
}
diff --git a/drivers/usb/typec/tipd/tps6598x.h b/drivers/usb/typec/tipd/tps6598x.h
index 89b24519463a..9b23e9017452 100644
--- a/drivers/usb/typec/tipd/tps6598x.h
+++ b/drivers/usb/typec/tipd/tps6598x.h
@@ -253,4 +253,15 @@
#define TPS_PTCC_DEV 2
#define TPS_PTCC_APP 3
+/* Version Register */
+#define TPS_VERSION_HW_VERSION_MASK GENMASK(31, 24)
+#define TPS_VERSION_HW_VERSION(x) TPS_FIELD_GET(TPS_VERSION_HW_VERSION_MASK, (x))
+#define TPS_VERSION_HW_65981_2_6 0x00
+#define TPS_VERSION_HW_65987_8_DH 0xF7
+#define TPS_VERSION_HW_65987_8_DK 0xF9
+
+/* Int Event Register length */
+#define TPS_65981_2_6_INTEVENT_LEN 8
+#define TPS_65987_8_INTEVENT_LEN 11
+
#endif /* __TPS6598X_H__ */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052353-bless-encrypt-6938@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07 Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Date: Mon, 29 Apr 2024 15:35:58 +0200
Subject: [PATCH] usb: typec: tipd: fix event checking for tps6598x
The current interrupt service routine of the tps6598x only reads the
first 64 bits of the INT_EVENT1 and INT_EVENT2 registers, which means
that any event above that range will be ignored, leaving interrupts
unattended. Moreover, those events will not be cleared, and the device
will keep the interrupt enabled.
This issue has been observed while attempting to load patches, and the
'ReadyForPatch' field (bit 81) of INT_EVENT1 was set.
Given that older versions of the tps6598x (1, 2 and 6) provide 8-byte
registers, a mechanism based on the upper byte of the version register
(0x0F) has been included. The manufacturer has confirmed [1] that this
byte is always 0 for older versions, and either 0xF7 (DH parts) or 0xF9
(DK parts) is returned in newer versions (7 and 8).
Read the complete INT_EVENT registers to handle all interrupts generated
by the device and account for the hardware version to select the
register size.
Link: https://e2e.ti.com/support/power-management-group/power-management/f/power-… [1]
Fixes: 0a4c005bd171 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Cc: stable(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Link: https://lore.kernel.org/r/20240429-tps6598x_fix_event_handling-v3-2-4e8e58d…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tipd/core.c b/drivers/usb/typec/tipd/core.c
index 7c2f01344860..191f86da283d 100644
--- a/drivers/usb/typec/tipd/core.c
+++ b/drivers/usb/typec/tipd/core.c
@@ -28,6 +28,7 @@
#define TPS_REG_MODE 0x03
#define TPS_REG_CMD1 0x08
#define TPS_REG_DATA1 0x09
+#define TPS_REG_VERSION 0x0F
#define TPS_REG_INT_EVENT1 0x14
#define TPS_REG_INT_EVENT2 0x15
#define TPS_REG_INT_MASK1 0x16
@@ -636,49 +637,67 @@ static irqreturn_t tps25750_interrupt(int irq, void *data)
static irqreturn_t tps6598x_interrupt(int irq, void *data)
{
+ int intev_len = TPS_65981_2_6_INTEVENT_LEN;
struct tps6598x *tps = data;
- u64 event1 = 0;
- u64 event2 = 0;
+ u64 event1[2] = { };
+ u64 event2[2] = { };
+ u32 version;
u32 status;
int ret;
mutex_lock(&tps->lock);
- ret = tps6598x_read64(tps, TPS_REG_INT_EVENT1, &event1);
- ret |= tps6598x_read64(tps, TPS_REG_INT_EVENT2, &event2);
+ ret = tps6598x_read32(tps, TPS_REG_VERSION, &version);
+ if (ret)
+ dev_warn(tps->dev, "%s: failed to read version (%d)\n",
+ __func__, ret);
+
+ if (TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DH ||
+ TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DK)
+ intev_len = TPS_65987_8_INTEVENT_LEN;
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
if (ret) {
- dev_err(tps->dev, "%s: failed to read events\n", __func__);
+ dev_err(tps->dev, "%s: failed to read event1\n", __func__);
goto err_unlock;
}
- trace_tps6598x_irq(event1, event2);
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT2, event2, intev_len);
+ if (ret) {
+ dev_err(tps->dev, "%s: failed to read event2\n", __func__);
+ goto err_unlock;
+ }
+ trace_tps6598x_irq(event1[0], event2[0]);
- if (!(event1 | event2))
+ if (!(event1[0] | event1[1] | event2[0] | event2[1]))
goto err_unlock;
if (!tps6598x_read_status(tps, &status))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_POWER_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_POWER_STATUS_UPDATE)
if (!tps6598x_read_power_status(tps))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_DATA_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_DATA_STATUS_UPDATE)
if (!tps6598x_read_data_status(tps))
goto err_clear_ints;
/* Handle plug insert or removal */
- if ((event1 | event2) & TPS_REG_INT_PLUG_EVENT)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_PLUG_EVENT)
tps6598x_handle_plug_event(tps, status);
err_clear_ints:
- tps6598x_write64(tps, TPS_REG_INT_CLEAR1, event1);
- tps6598x_write64(tps, TPS_REG_INT_CLEAR2, event2);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR1, event1, intev_len);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR2, event2, intev_len);
err_unlock:
mutex_unlock(&tps->lock);
- if (event1 | event2)
+ if (event1[0] | event1[1] | event2[0] | event2[1])
return IRQ_HANDLED;
+
return IRQ_NONE;
}
diff --git a/drivers/usb/typec/tipd/tps6598x.h b/drivers/usb/typec/tipd/tps6598x.h
index 89b24519463a..9b23e9017452 100644
--- a/drivers/usb/typec/tipd/tps6598x.h
+++ b/drivers/usb/typec/tipd/tps6598x.h
@@ -253,4 +253,15 @@
#define TPS_PTCC_DEV 2
#define TPS_PTCC_APP 3
+/* Version Register */
+#define TPS_VERSION_HW_VERSION_MASK GENMASK(31, 24)
+#define TPS_VERSION_HW_VERSION(x) TPS_FIELD_GET(TPS_VERSION_HW_VERSION_MASK, (x))
+#define TPS_VERSION_HW_65981_2_6 0x00
+#define TPS_VERSION_HW_65987_8_DH 0xF7
+#define TPS_VERSION_HW_65987_8_DK 0xF9
+
+/* Int Event Register length */
+#define TPS_65981_2_6_INTEVENT_LEN 8
+#define TPS_65987_8_INTEVENT_LEN 11
+
#endif /* __TPS6598X_H__ */
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052352-liberty-uneven-ef8a@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 409c1cfb5a803f3cf2d17aeaf75c25c4be951b07 Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Date: Mon, 29 Apr 2024 15:35:58 +0200
Subject: [PATCH] usb: typec: tipd: fix event checking for tps6598x
The current interrupt service routine of the tps6598x only reads the
first 64 bits of the INT_EVENT1 and INT_EVENT2 registers, which means
that any event above that range will be ignored, leaving interrupts
unattended. Moreover, those events will not be cleared, and the device
will keep the interrupt enabled.
This issue has been observed while attempting to load patches, and the
'ReadyForPatch' field (bit 81) of INT_EVENT1 was set.
Given that older versions of the tps6598x (1, 2 and 6) provide 8-byte
registers, a mechanism based on the upper byte of the version register
(0x0F) has been included. The manufacturer has confirmed [1] that this
byte is always 0 for older versions, and either 0xF7 (DH parts) or 0xF9
(DK parts) is returned in newer versions (7 and 8).
Read the complete INT_EVENT registers to handle all interrupts generated
by the device and account for the hardware version to select the
register size.
Link: https://e2e.ti.com/support/power-management-group/power-management/f/power-… [1]
Fixes: 0a4c005bd171 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Cc: stable(a)vger.kernel.org
Signed-off-by: Javier Carrasco <javier.carrasco(a)wolfvision.net>
Link: https://lore.kernel.org/r/20240429-tps6598x_fix_event_handling-v3-2-4e8e58d…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tipd/core.c b/drivers/usb/typec/tipd/core.c
index 7c2f01344860..191f86da283d 100644
--- a/drivers/usb/typec/tipd/core.c
+++ b/drivers/usb/typec/tipd/core.c
@@ -28,6 +28,7 @@
#define TPS_REG_MODE 0x03
#define TPS_REG_CMD1 0x08
#define TPS_REG_DATA1 0x09
+#define TPS_REG_VERSION 0x0F
#define TPS_REG_INT_EVENT1 0x14
#define TPS_REG_INT_EVENT2 0x15
#define TPS_REG_INT_MASK1 0x16
@@ -636,49 +637,67 @@ static irqreturn_t tps25750_interrupt(int irq, void *data)
static irqreturn_t tps6598x_interrupt(int irq, void *data)
{
+ int intev_len = TPS_65981_2_6_INTEVENT_LEN;
struct tps6598x *tps = data;
- u64 event1 = 0;
- u64 event2 = 0;
+ u64 event1[2] = { };
+ u64 event2[2] = { };
+ u32 version;
u32 status;
int ret;
mutex_lock(&tps->lock);
- ret = tps6598x_read64(tps, TPS_REG_INT_EVENT1, &event1);
- ret |= tps6598x_read64(tps, TPS_REG_INT_EVENT2, &event2);
+ ret = tps6598x_read32(tps, TPS_REG_VERSION, &version);
+ if (ret)
+ dev_warn(tps->dev, "%s: failed to read version (%d)\n",
+ __func__, ret);
+
+ if (TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DH ||
+ TPS_VERSION_HW_VERSION(version) == TPS_VERSION_HW_65987_8_DK)
+ intev_len = TPS_65987_8_INTEVENT_LEN;
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
+
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT1, event1, intev_len);
if (ret) {
- dev_err(tps->dev, "%s: failed to read events\n", __func__);
+ dev_err(tps->dev, "%s: failed to read event1\n", __func__);
goto err_unlock;
}
- trace_tps6598x_irq(event1, event2);
+ ret = tps6598x_block_read(tps, TPS_REG_INT_EVENT2, event2, intev_len);
+ if (ret) {
+ dev_err(tps->dev, "%s: failed to read event2\n", __func__);
+ goto err_unlock;
+ }
+ trace_tps6598x_irq(event1[0], event2[0]);
- if (!(event1 | event2))
+ if (!(event1[0] | event1[1] | event2[0] | event2[1]))
goto err_unlock;
if (!tps6598x_read_status(tps, &status))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_POWER_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_POWER_STATUS_UPDATE)
if (!tps6598x_read_power_status(tps))
goto err_clear_ints;
- if ((event1 | event2) & TPS_REG_INT_DATA_STATUS_UPDATE)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_DATA_STATUS_UPDATE)
if (!tps6598x_read_data_status(tps))
goto err_clear_ints;
/* Handle plug insert or removal */
- if ((event1 | event2) & TPS_REG_INT_PLUG_EVENT)
+ if ((event1[0] | event2[0]) & TPS_REG_INT_PLUG_EVENT)
tps6598x_handle_plug_event(tps, status);
err_clear_ints:
- tps6598x_write64(tps, TPS_REG_INT_CLEAR1, event1);
- tps6598x_write64(tps, TPS_REG_INT_CLEAR2, event2);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR1, event1, intev_len);
+ tps6598x_block_write(tps, TPS_REG_INT_CLEAR2, event2, intev_len);
err_unlock:
mutex_unlock(&tps->lock);
- if (event1 | event2)
+ if (event1[0] | event1[1] | event2[0] | event2[1])
return IRQ_HANDLED;
+
return IRQ_NONE;
}
diff --git a/drivers/usb/typec/tipd/tps6598x.h b/drivers/usb/typec/tipd/tps6598x.h
index 89b24519463a..9b23e9017452 100644
--- a/drivers/usb/typec/tipd/tps6598x.h
+++ b/drivers/usb/typec/tipd/tps6598x.h
@@ -253,4 +253,15 @@
#define TPS_PTCC_DEV 2
#define TPS_PTCC_APP 3
+/* Version Register */
+#define TPS_VERSION_HW_VERSION_MASK GENMASK(31, 24)
+#define TPS_VERSION_HW_VERSION(x) TPS_FIELD_GET(TPS_VERSION_HW_VERSION_MASK, (x))
+#define TPS_VERSION_HW_65981_2_6 0x00
+#define TPS_VERSION_HW_65987_8_DH 0xF7
+#define TPS_VERSION_HW_65987_8_DK 0xF9
+
+/* Int Event Register length */
+#define TPS_65981_2_6_INTEVENT_LEN 8
+#define TPS_65987_8_INTEVENT_LEN 11
+
#endif /* __TPS6598X_H__ */
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 9af503d91298c3f2945e73703f0e00995be08c30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024051346-unvocal-magnetism-4ae1@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
9af503d91298 ("btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks()")
7411055db5ce ("btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 9af503d91298c3f2945e73703f0e00995be08c30 Mon Sep 17 00:00:00 2001
From: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
Date: Fri, 19 Apr 2024 11:22:48 +0900
Subject: [PATCH] btrfs: add missing mutex_unlock in
btrfs_relocate_sys_chunks()
The previous patch that replaced BUG_ON by error handling forgot to
unlock the mutex in the error path.
Link: https://lore.kernel.org/all/Zh%2fHpAGFqa7YAFuM@duo.ucw.cz
Reported-by: Pavel Machek <pavel(a)denx.de>
Fixes: 7411055db5ce ("btrfs: handle chunk tree lookup error in btrfs_relocate_sys_chunks()")
CC: stable(a)vger.kernel.org
Reviewed-by: Pavel Machek <pavel(a)denx.de>
Signed-off-by: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
Reviewed-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dedec3d9b111..c72c351fe7eb 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3419,6 +3419,7 @@ static int btrfs_relocate_sys_chunks(struct btrfs_fs_info *fs_info)
* alignment and size).
*/
ret = -EUCLEAN;
+ mutex_unlock(&fs_info->reclaim_bgs_lock);
goto error;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x fb7a0d334894206ae35f023a82cad5a290fd7386
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024051325-dreamt-freebee-5563@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
fb7a0d334894 ("mptcp: ensure snd_nxt is properly initialized on connect")
54f1944ed6d2 ("mptcp: factor out mptcp_connect()")
a42cf9d18278 ("mptcp: poll allow write call before actual connect")
d98a82a6afc7 ("mptcp: handle defer connect in mptcp_sendmsg")
3e5014909b56 ("mptcp: cleanup MPJ subflow list handling")
3d1d6d66e156 ("mptcp: implement support for user-space disconnect")
b29fcfb54cd7 ("mptcp: full disconnect implementation")
3ce0852c86b9 ("mptcp: enforce HoL-blocking estimation")
7cd2802d7496 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fb7a0d334894206ae35f023a82cad5a290fd7386 Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Mon, 29 Apr 2024 20:00:31 +0200
Subject: [PATCH] mptcp: ensure snd_nxt is properly initialized on connect
Christoph reported a splat hinting at a corrupted snd_una:
WARNING: CPU: 1 PID: 38 at net/mptcp/protocol.c:1005 __mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Modules linked in:
CPU: 1 PID: 38 Comm: kworker/1:1 Not tainted 6.9.0-rc1-gbbeac67456c9 #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
Workqueue: events mptcp_worker
RIP: 0010:__mptcp_clean_una+0x4b3/0x620 net/mptcp/protocol.c:1005
Code: be 06 01 00 00 bf 06 01 00 00 e8 a8 12 e7 fe e9 00 fe ff ff e8
8e 1a e7 fe 0f b7 ab 3e 02 00 00 e9 d3 fd ff ff e8 7d 1a e7 fe
<0f> 0b 4c 8b bb e0 05 00 00 e9 74 fc ff ff e8 6a 1a e7 fe 0f 0b e9
RSP: 0018:ffffc9000013fd48 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8881029bd280 RCX: ffffffff82382fe4
RDX: ffff8881003cbd00 RSI: ffffffff823833c3 RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: fefefefefefefeff R12: ffff888138ba8000
R13: 0000000000000106 R14: ffff8881029bd908 R15: ffff888126560000
FS: 0000000000000000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f604a5dae38 CR3: 0000000101dac002 CR4: 0000000000170ef0
Call Trace:
<TASK>
__mptcp_clean_una_wakeup net/mptcp/protocol.c:1055 [inline]
mptcp_clean_una_wakeup net/mptcp/protocol.c:1062 [inline]
__mptcp_retrans+0x7f/0x7e0 net/mptcp/protocol.c:2615
mptcp_worker+0x434/0x740 net/mptcp/protocol.c:2767
process_one_work+0x1e0/0x560 kernel/workqueue.c:3254
process_scheduled_works kernel/workqueue.c:3335 [inline]
worker_thread+0x3c7/0x640 kernel/workqueue.c:3416
kthread+0x121/0x170 kernel/kthread.c:388
ret_from_fork+0x44/0x50 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
</TASK>
When fallback to TCP happens early on a client socket, snd_nxt
is not yet initialized and any incoming ack will copy such value
into snd_una. If the mptcp worker (dumbly) tries mptcp-level
re-injection after such ack, that would unconditionally trigger a send
buffer cleanup using 'bad' snd_una values.
We could easily disable re-injection for fallback sockets, but such
dumb behavior has already helped catch a few subtle issues and has very
low to zero impact in practice.
Instead address the issue always initializing snd_nxt (and write_seq,
for consistency) at connect time.
Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect")
Cc: stable(a)vger.kernel.org
Reported-by: Christoph Paasch <cpaasch(a)apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485
Tested-by: Christoph Paasch <cpaasch(a)apple.com>
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://lore.kernel.org/r/20240429-upstream-net-20240429-mptcp-snd_nxt-init…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 7e74b812e366..965eb69dc5de 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3723,6 +3723,9 @@ static int mptcp_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_TOKENFALLBACKINIT);
mptcp_subflow_early_fallback(msk, subflow);
}
+
+ WRITE_ONCE(msk->write_seq, subflow->idsn);
+ WRITE_ONCE(msk->snd_nxt, subflow->idsn);
if (likely(!__mptcp_check_fallback(msk)))
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVE);
From: Cristian Marussi <cristian.marussi(a)arm.com>
[ Upstream commit e9076ffbcaed5da6c182b144ef9f6e24554af268 ]
Accessing reset domains descriptors by the index upon the SCMI drivers
requests through the SCMI reset operations interface can potentially
lead to out-of-bounds violations if the SCMI driver misbehaves.
Add an internal consistency check before any such domains descriptors
accesses.
Link: https://lore.kernel.org/r/20220817172731.1185305-5-cristian.marussi@arm.com
Signed-off-by: Cristian Marussi <cristian.marussi(a)arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla(a)arm.com>
Signed-off-by: Dominique Martinet <dominique.martinet(a)atmark-techno.com>
---
This is the backport I promised for CVE-2022-48655[1]
[1] https://lkml.kernel.org/r/Zj4t4q_w6gqzdvhz@codewreck.org
The 'pi' variable declaration context just changed a bit
(handle->reset_priv -> ph->get_priv(ph)) but the patch is
otherwise fine as is.
(I've also checked that num_domains is properly initialized at module
init time and this part of the code hasn't changed until 5.15, so it
should be safe to use this previously unused field)
This same patch applies cleanly to both 5.4.275 and 5.10.216.
Thanks!
drivers/firmware/arm_scmi/reset.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/arm_scmi/reset.c b/drivers/firmware/arm_scmi/reset.c
index a981a22cfe89..b8388a3b9c06 100644
--- a/drivers/firmware/arm_scmi/reset.c
+++ b/drivers/firmware/arm_scmi/reset.c
@@ -149,8 +149,12 @@ static int scmi_domain_reset(const struct scmi_handle *handle, u32 domain,
struct scmi_xfer *t;
struct scmi_msg_reset_domain_reset *dom;
struct scmi_reset_info *pi = handle->reset_priv;
- struct reset_dom_info *rdom = pi->dom_info + domain;
+ struct reset_dom_info *rdom;
+ if (domain >= pi->num_domains)
+ return -EINVAL;
+
+ rdom = pi->dom_info + domain;
if (rdom->async_reset)
flags |= ASYNCHRONOUS_RESET;
--
2.39.2
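
The consistency check in this backport follows a common pattern: validate a
caller-supplied index against the table size before forming a pointer into
the table. A minimal userspace sketch of that pattern follows. The types
(struct reset_info, struct dom_info) and the helper lookup_domain() are
simplified, hypothetical stand-ins and not the SCMI driver's real
definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's descriptor table. */
struct dom_info {
	int async_reset;
};

struct reset_info {
	uint32_t num_domains;
	struct dom_info *dom_info;
};

/*
 * Return 0 and store the descriptor in *out, or -EINVAL if the index is
 * out of range. The pointer is only computed after the check passes.
 */
static int lookup_domain(const struct reset_info *pi, uint32_t domain,
			 struct dom_info **out)
{
	if (domain >= pi->num_domains)
		return -EINVAL;

	*out = pi->dom_info + domain;
	return 0;
}
```

With a two-entry table, index 1 resolves to the second descriptor and index 2
is rejected with -EINVAL instead of producing a pointer past the table.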
From: Jiri Olsa <jolsa(a)kernel.org>
commit 117211aa739a926e6555cfea883be84bee6f1695 upstream.
Pengfei Xu reported [1] Syzkaller/KASAN issue found in bpf_link_show_fdinfo.
The reason is a missing BPF_LINK_TYPE invocation for the uprobe multi
link and for several other links; add them.
[1] https://lore.kernel.org/bpf/ZXptoKRSLspnk2ie@xpf.sh.intel.com/
Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link")
Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
Fixes: 84601d6ee68a ("bpf: add bpf_link support for BPF_NETFILTER programs")
Fixes: 35dfaad7188c ("netkit, bpf: Add bpf programmable net device")
Reported-by: Pengfei Xu <pengfei.xu(a)intel.com>
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Tested-by: Pengfei Xu <pengfei.xu(a)intel.com>
Acked-by: Hou Tao <houtao1(a)huawei.com>
Link: https://lore.kernel.org/bpf/20231215230502.2769743-1-jolsa@kernel.org
Cc: stable(a)vger.kernel.org # 6.6
Signed-off-by: Ignat Korchagin <ignat(a)cloudflare.com>
---
Hi,
We have experienced a KASAN warning in production on a 6.6 kernel, similar to
[1]. This backported patch was adjusted to apply onto 6.6 stable branch: the
only change is dropping the BPF_LINK_TYPE(BPF_LINK_TYPE_NETKIT, netkit)
definition from the header as netkit was only introduced in 6.7 and 6.7 has the
backport already.
I was not able to run the syzkaller reproducer from [1], but we have not seen
the KASAN warning in production since applying this patch internally.
Regards,
Ignat
include/linux/bpf_types.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index fc0d6f32c687..dfaae3e3ec15 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -142,9 +142,12 @@ BPF_LINK_TYPE(BPF_LINK_TYPE_ITER, iter)
#ifdef CONFIG_NET
BPF_LINK_TYPE(BPF_LINK_TYPE_NETNS, netns)
BPF_LINK_TYPE(BPF_LINK_TYPE_XDP, xdp)
+BPF_LINK_TYPE(BPF_LINK_TYPE_NETFILTER, netfilter)
+BPF_LINK_TYPE(BPF_LINK_TYPE_TCX, tcx)
#endif
#ifdef CONFIG_PERF_EVENTS
BPF_LINK_TYPE(BPF_LINK_TYPE_PERF_EVENT, perf)
#endif
BPF_LINK_TYPE(BPF_LINK_TYPE_KPROBE_MULTI, kprobe_multi)
BPF_LINK_TYPE(BPF_LINK_TYPE_STRUCT_OPS, struct_ops)
+BPF_LINK_TYPE(BPF_LINK_TYPE_UPROBE_MULTI, uprobe_multi)
--
2.39.2
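
To illustrate why a missing list entry can turn into an out-of-bounds read,
here is a small single-file sketch of an X-macro style name table. It is only
an assumption about the general mechanism, not the kernel's code: the enum
values, LINK_TYPE_LIST and link_name() are hypothetical, and the bounds check
in link_name() is added here so that the sketch itself stays safe.

```c
#include <stddef.h>

/* Hypothetical link types; LINK_UPROBE_MULTI is deliberately not listed below. */
enum link_type { LINK_UNSPEC, LINK_ITER, LINK_XDP, LINK_UPROBE_MULTI, LINK_MAX };

/* One list entry per known type, analogous to one BPF_LINK_TYPE() line. */
#define LINK_TYPE_LIST(X) \
	X(LINK_ITER, iter) \
	X(LINK_XDP, xdp)

/* The table is sized by the highest index that appears in the list. */
#define AS_NAME(id, name) [id] = #name,
static const char *const link_names[] = {
	[LINK_UNSPEC] = "<invalid>",
	LINK_TYPE_LIST(AS_NAME)
};
#undef AS_NAME

#define NAMES_LEN (sizeof(link_names) / sizeof(link_names[0]))

/* Return the name, or NULL when the type has no entry in the table. */
static const char *link_name(enum link_type t)
{
	if ((size_t)t >= NAMES_LEN || !link_names[t])
		return NULL;
	return link_names[t];
}
```

Because LINK_UPROBE_MULTI has no list entry, the table has only three slots
and an unchecked link_names[LINK_UPROBE_MULTI] would read past its end.
Adding the entry to the list grows the table to cover it.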
From: Mark Rutland <mark.rutland(a)arm.com>
[ Upstream commit 657eef0a5420a02c02945ed8c87f2ddcbd255772 ]
Currently CONFIG_ARM64_USE_LSE_ATOMICS depends upon CONFIG_JUMP_LABEL,
as the inline atomics were indirected with a static branch.
However, since commit:
21fb26bfb01ffe0d ("arm64: alternatives: add alternative_has_feature_*()")
... we use an alternative_branch (which is always available) rather than
a static branch, and hence the dependency is unnecessary.
Remove the stale dependency, along with the stale include. This will
allow the use of LSE atomics in kernels built with CONFIG_JUMP_LABEL=n,
and reduces the risk of circular header dependencies via <asm/lse.h>.
Signed-off-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Link: https://lore.kernel.org/r/20221114125424.2998268-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will(a)kernel.org>
Signed-off-by: Oleksandr Tymoshenko <ovt(a)google.com>
---
arch/arm64/Kconfig | 1 -
arch/arm64/include/asm/lse.h | 1 -
2 files changed, 2 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c15f71501c6c..044b98a62f7b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1752,7 +1752,6 @@ config ARM64_LSE_ATOMICS
config ARM64_USE_LSE_ATOMICS
bool "Atomic instructions"
- depends on JUMP_LABEL
default y
help
As part of the Large System Extensions, ARMv8.1 introduces new
diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
index c503db8e73b0..f99d74826a7e 100644
--- a/arch/arm64/include/asm/lse.h
+++ b/arch/arm64/include/asm/lse.h
@@ -10,7 +10,6 @@
#include <linux/compiler_types.h>
#include <linux/export.h>
-#include <linux/jump_label.h>
#include <linux/stringify.h>
#include <asm/alternative.h>
#include <asm/alternative-macros.h>
---
base-commit: 4078fa637fcd80c8487680ec2e4ef7c58308e9aa
change-id: 20240521-lse-atomics-6-1-b0960e206035
Best regards,
--
Oleksandr Tymoshenko <ovt(a)google.com>
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6c41468c7c12d74843bb414fc00307ea8a6318c3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023041135-yippee-shabby-b9ad@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
6c41468c7c12 ("KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection")
d4963e319f1f ("KVM: x86: Make kvm_queued_exception a properly named, visible struct")
6ad75c5c99f7 ("KVM: x86: Rename kvm_x86_ops.queue_exception to inject_exception")
5623f751bd9c ("KVM: x86: Treat #DBs from the emulator as fault-like (code and DR7.GD=1)")
8d178f460772 ("KVM: nVMX: Treat General Detect #DB (DR7.GD=1) as fault-like")
eba9799b5a6e ("KVM: VMX: Drop bits 31:16 when shoving exception error code into VMCS")
a61d7c5432ac ("KVM: x86: Trace re-injected exceptions")
6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
3741aec4c38f ("KVM: SVM: Stuff next_rip on emulated INT3 injection if NRIPS is supported")
cd9e6da8048c ("KVM: SVM: Unwind "speculative" RIP advancement if INTn injection "fails"")
00f08d99dd7d ("KVM: nSVM: Sync next_rip field from vmcb12 to vmcb02")
9bd1f0efa859 ("KVM: nVMX: Clear IDT vectoring on nested VM-Exit for double/triple fault")
c3634d25fbee ("KVM: nVMX: Leave most VM-Exit info fields unmodified on failed VM-Entry")
1d5a1b5860ed ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
db663af4a001 ("kvm: x86: SVM: use vmcb* instead of svm->vmcb where it makes sense")
b9f3973ab3a8 ("KVM: x86: nSVM: implement nested VMLOAD/VMSAVE")
23e5092b6e2a ("KVM: SVM: Rename hook implementations to conform to kvm_x86_ops' names")
e27bc0440ebd ("KVM: x86: Rename kvm_x86_ops pointers to align w/ preferred vendor names")
068f7ea61895 ("KVM: SVM: improve split between svm_prepare_guest_switch and sev_es_prepare_guest_switch")
e1779c2714c3 ("KVM: x86: nSVM: fix potential NULL derefernce on nested migration")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6c41468c7c12d74843bb414fc00307ea8a6318c3 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Wed, 22 Mar 2023 07:32:59 -0700
Subject: [PATCH] KVM: x86: Clear "has_error_code", not "error_code", for RM
exception injection
When injecting an exception into a vCPU in Real Mode, suppress the error
code by clearing the flag that tracks whether the error code is valid, not
by clearing the error code itself. The "typo" was introduced by recent
fix for SVM's funky Paged Real Mode.
Opportunistically hoist the logic above the tracepoint so that the trace
is coherent with respect to what is actually injected (this was also the
behavior prior to the buggy commit).
Fixes: b97f07458373 ("KVM: x86: determine if an exception has an error code only when injecting it.")
Cc: stable(a)vger.kernel.org
Cc: Maxim Levitsky <mlevitsk(a)redhat.com>
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Message-Id: <20230322143300.2209476-2-seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45017576ad5e..7d6f98b7635f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9908,13 +9908,20 @@ int kvm_check_nested_events(struct kvm_vcpu *vcpu)
static void kvm_inject_exception(struct kvm_vcpu *vcpu)
{
+ /*
+ * Suppress the error code if the vCPU is in Real Mode, as Real Mode
+ * exceptions don't report error codes. The presence of an error code
+ * is carried with the exception and only stripped when the exception
+ * is injected as intercepted #PF VM-Exits for AMD's Paged Real Mode do
+ * report an error code despite the CPU being in Real Mode.
+ */
+ vcpu->arch.exception.has_error_code &= is_protmode(vcpu);
+
trace_kvm_inj_exception(vcpu->arch.exception.vector,
vcpu->arch.exception.has_error_code,
vcpu->arch.exception.error_code,
vcpu->arch.exception.injected);
- if (vcpu->arch.exception.error_code && !is_protmode(vcpu))
- vcpu->arch.exception.error_code = false;
static_call(kvm_x86_inject_exception)(vcpu);
}
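
The difference between clearing the error code and clearing the flag that
says an error code is present can be shown in a few lines. This is a
userspace sketch under stated assumptions: struct queued_exception and both
helpers are hypothetical simplifications, and the protmode argument stands in
for the result of is_protmode(vcpu).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the queued exception state. */
struct queued_exception {
	bool has_error_code;
	uint32_t error_code;
};

/* Buggy variant: zeroes the value but leaves the "valid" flag set. */
static void suppress_buggy(struct queued_exception *ex, bool protmode)
{
	if (ex->error_code && !protmode)
		ex->error_code = 0;
}

/* Fixed variant: clears the flag, so no error code is reported at all. */
static void suppress_fixed(struct queued_exception *ex, bool protmode)
{
	ex->has_error_code &= protmode;
}
```

In Real Mode (protmode false) the buggy variant still reports an error code,
now with value 0, while the fixed variant reports none. In Protected Mode the
fixed variant leaves the flag untouched.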
Backport the fix commit ("tls: fix race between async notify and socket close") for CVE-2024-26583 [1].
It depends on three tls commits that simplify and factor out async waiting.
They also ease backporting the fix commit ("net: tls: handle backlogging of crypto requests")
for CVE-2024-26584 [2]. Therefore, add them for a clean backport:
Jakub Kicinski (4):
tls: rx: simplify async wait
net: tls: factor out tls_*crypt_async_wait()
tls: fix race between async notify and socket close
net: tls: handle backlogging of crypto requests
Sabrina Dubroca (1):
tls: extract context alloc/initialization out of tls_set_sw_offload
Please review and consider applying these patches.
[1] https://lore.kernel.org/all/2024022146-traction-unjustly-f451@gregkh/
[2] https://lore.kernel.org/all/2024022148-showpiece-yanking-107c@gregkh/
include/net/tls.h | 6 --
net/tls/tls_sw.c | 199 ++++++++++++++++++++++++----------------------
2 files changed, 106 insertions(+), 99 deletions(-)
--
2.40.1
Fix a use-after-free on dentry's d_fsdata fid list when a thread
looks up a fid through dentry while another thread unlinks it:
UAF thread:
refcount_t: addition on 0; use-after-free.
p9_fid_get linux/./include/net/9p/client.h:262
v9fs_fid_find+0x236/0x280 linux/fs/9p/fid.c:129
v9fs_fid_lookup_with_uid linux/fs/9p/fid.c:181
v9fs_fid_lookup+0xbf/0xc20 linux/fs/9p/fid.c:314
v9fs_vfs_getattr_dotl+0xf9/0x360 linux/fs/9p/vfs_inode_dotl.c:400
vfs_statx+0xdd/0x4d0 linux/fs/stat.c:248
Freed by:
p9_client_clunk+0xb0/0xe0 linux/net/9p/client.c:1456
p9_fid_put linux/./include/net/9p/client.h:278
v9fs_dentry_release+0xb5/0x140 linux/fs/9p/vfs_dentry.c:55
v9fs_remove+0x38f/0x620 linux/fs/9p/vfs_inode.c:518
vfs_unlink+0x29a/0x810 linux/fs/namei.c:4335
The problem is that d_fsdata was not accessed under d_lock, because
d_release() normally is only called once the dentry is otherwise no
longer accessible but since we also call it explicitly in v9fs_remove
that lock is required:
move the hlist out of the dentry under lock then unref its fids once
they are no longer accessible.
Fixes: 154372e67d40 ("fs/9p: fix create-unlink-getattr idiom")
Cc: stable(a)vger.kernel.org
Reported-by: Meysam Firouzi
Reported-by: Amirmohammad Eftekhar
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
---
fs/9p/vfs_dentry.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/9p/vfs_dentry.c b/fs/9p/vfs_dentry.c
index f16f73581634..01338d4c2d9e 100644
--- a/fs/9p/vfs_dentry.c
+++ b/fs/9p/vfs_dentry.c
@@ -48,12 +48,17 @@ static int v9fs_cached_dentry_delete(const struct dentry *dentry)
static void v9fs_dentry_release(struct dentry *dentry)
{
struct hlist_node *p, *n;
+ struct hlist_head head;
p9_debug(P9_DEBUG_VFS, " dentry: %pd (%p)\n",
dentry, dentry);
- hlist_for_each_safe(p, n, (struct hlist_head *)&dentry->d_fsdata)
+
+ spin_lock(&dentry->d_lock);
+ hlist_move_list((struct hlist_head *)&dentry->d_fsdata, &head);
+ spin_unlock(&dentry->d_lock);
+
+ hlist_for_each_safe(p, n, &head)
p9_fid_put(hlist_entry(p, struct p9_fid, dlist));
- dentry->d_fsdata = NULL;
}
static int v9fs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
--
2.44.0
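
The approach described in the commit message (detach the list while holding
the lock, then drop the references once nobody else can reach them) can be
sketched in userspace. Everything here is an assumption made for
illustration: struct owner, struct ref and owner_release_all() are
hypothetical, a pthread mutex stands in for d_lock, and setting a flag
stands in for p9_fid_put().

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins: a lock-protected owner holding a list of refs. */
struct ref {
	struct ref *next;
	int released;
};

struct owner {
	pthread_mutex_t lock;
	struct ref *head;	/* protected by lock */
};

/*
 * Detach the whole list while holding the lock, then release the
 * entries after dropping it. Returns the number of entries released.
 */
static int owner_release_all(struct owner *o)
{
	struct ref *list, *next;
	int n = 0;

	pthread_mutex_lock(&o->lock);
	list = o->head;
	o->head = NULL;
	pthread_mutex_unlock(&o->lock);

	for (; list; list = next) {
		next = list->next;	/* read before the entry is released */
		list->released = 1;
		n++;
	}
	return n;
}
```

A concurrent lookup that takes the same lock sees either the full list or an
empty one, never an entry that is in the middle of being released.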
Some registers may be modified by parallel execution contexts and
require protections to prevent corruption.
A review of the driver revealed the need for these additional
protections.
Doug Berger (3):
net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access
net: bcmgenet: synchronize use of bcmgenet_set_rx_mode()
net: bcmgenet: synchronize UMAC_CMD access
drivers/net/ethernet/broadcom/genet/bcmgenet.c | 14 +++++++++++++-
drivers/net/ethernet/broadcom/genet/bcmgenet.h | 2 ++
drivers/net/ethernet/broadcom/genet/bcmgenet_wol.c | 6 ++++++
drivers/net/ethernet/broadcom/genet/bcmmii.c | 4 ++++
4 files changed, 25 insertions(+), 1 deletion(-)
--
These commits are dependent on the previously submitted:
[PATCH stable 5.4 0/2] net: bcmgenet: revisit MAC reset
2.34.1
Commit 3a55402c9387 ("net: bcmgenet: use RGMII loopback for MAC
reset") was intended to resolve issues with resetting the UniMAC
core within the GENET block by providing better control over the
clocks used by the UniMAC core. Unfortunately, it is not
compatible with all of the supported system configurations so an
alternative method must be applied.
This commit set provides such an alternative. The first commit
reverts the previous change and the second commit provides the
alternative reset sequence that addresses the concerns observed
with the previous implementation.
This replacement implementation should be applied to the stable
branches wherever commit 3a55402c9387 ("net: bcmgenet: use RGMII
loopback for MAC reset") has been applied.
Unfortunately, reverting that commit may conflict with some
restructuring changes introduced by commit 4f8d81b77e66 ("net:
bcmgenet: Refactor register access in bcmgenet_mii_config").
The first commit in this set has been manually edited to
resolve the conflict on stable/linux-5.4.y.
Doug Berger (2):
Revert "net: bcmgenet: use RGMII loopback for MAC reset"
net: bcmgenet: keep MAC in reset until PHY is up
.../net/ethernet/broadcom/genet/bcmgenet.c | 10 ++---
.../ethernet/broadcom/genet/bcmgenet_wol.c | 6 ++-
drivers/net/ethernet/broadcom/genet/bcmmii.c | 39 +++----------------
3 files changed, 16 insertions(+), 39 deletions(-)
--
2.34.1
From: Mark Brown <broonie(a)kernel.org>
[ Upstream commit 907f33028871fa7c9a3db1efd467b78ef82cce20 ]
The standard library perror() function provides a convenient way to print
an error message based on the current errno but this doesn't play nicely
with KTAP output. Provide a helper which does an equivalent thing in a KTAP
compatible format.
nolibc doesn't have a strerror() and adding the table of strings required
doesn't seem like a good fit for what it's trying to do so when we're using
that only print the errno.
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Signed-off-by: Shuah Khan <skhan(a)linuxfoundation.org>
Stable-dep-of: 071af0c9e582 ("selftests: timers: Convert posix_timers test to generate KTAP output")
Signed-off-by: Edward Liaw <edliaw(a)google.com>
---
tools/testing/selftests/kselftest.h | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/tools/testing/selftests/kselftest.h b/tools/testing/selftests/kselftest.h
index e8eecbc83a60..ad7b97e16f37 100644
--- a/tools/testing/selftests/kselftest.h
+++ b/tools/testing/selftests/kselftest.h
@@ -48,6 +48,7 @@
#include <stdlib.h>
#include <unistd.h>
#include <stdarg.h>
+#include <string.h>
#include <stdio.h>
#include <sys/utsname.h>
#endif
@@ -156,6 +157,19 @@ static inline void ksft_print_msg(const char *msg, ...)
va_end(args);
}
+static inline void ksft_perror(const char *msg)
+{
+#ifndef NOLIBC
+ ksft_print_msg("%s: %s (%d)\n", msg, strerror(errno), errno);
+#else
+ /*
+ * nolibc doesn't provide strerror() and it seems
+ * inappropriate to add one, just print the errno.
+ */
+ ksft_print_msg("%s: %d\n", msg, errno);
+#endif
+}
+
static inline void ksft_test_result_pass(const char *msg, ...)
{
int saved_errno = errno;
--
2.45.0.215.g3402c0e53f-goog
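
For readers unfamiliar with KTAP, diagnostic lines start with "# ", which is
why a plain perror() does not fit. The sketch below shows the shape of the
message this helper produces. It is an assumption-laden illustration, not the
kselftest code: format_ktap_perror() is a hypothetical name, and it formats
into a buffer rather than printing so that the result can be inspected.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "# <msg>: <strerror> (<errno>)" into buf. Hypothetical helper:
 * it writes to a buffer instead of stdout so the output can be checked.
 * Returns the value of snprintf().
 */
static int format_ktap_perror(char *buf, size_t len, const char *msg, int err)
{
	return snprintf(buf, len, "# %s: %s (%d)\n", msg, strerror(err), err);
}
```

Calling it with msg "open" and err ENOENT yields one line that begins with
"# open: " and contains the strerror() text followed by the numeric errno in
parentheses.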
From: Dave Chinner <dchinner(a)redhat.com>
[ Upstream commit 118e021b4b66f758f8e8f21dc0e5e0a4c721e69e ]
When we reserve a delalloc region in xfs_buffered_write_iomap_begin,
we mark the iomap as IOMAP_F_NEW so that the write context
understands that it allocated the delalloc region.
If we then fail that buffered write, xfs_buffered_write_iomap_end()
checks for the IOMAP_F_NEW flag and if it is set, it punches out
the unused delalloc region that was allocated for the write.
The assumption this code makes is that all buffered write operations
that can allocate space are run under an exclusive lock (i_rwsem).
This is an invalid assumption: page faults in mmap()d regions call
through this same function pair to map the file range being faulted
and this runs only holding the inode->i_mapping->invalidate_lock in
shared mode.
IOWs, we can have races between page faults and write() calls that
fail the nested page cache write operation that result in data loss.
That is, the failing iomap_end call will punch out the data that
the other racing iomap iteration brought into the page cache. This
can be reproduced with generic/34[46] if we arbitrarily fail page
cache copy-in operations from write() syscalls.
Code analysis tells us that the iomap_page_mkwrite() function holds
the already instantiated and uptodate folio locked across the iomap
mapping iterations. Hence the folio cannot be removed from memory
whilst we are mapping the range it covers, and as such we do not
care if the mapping changes state underneath the iomap iteration
loop:
1. if the folio is not already dirty, there are no writeback races
possible.
2. if we allocated the mapping (delalloc or unwritten), the folio
cannot already be dirty. See #1.
3. If the folio is already dirty, it must be up to date. As we hold
it locked, it cannot be reclaimed from memory. Hence we always
have valid data in the page cache while iterating the mapping.
4. Valid data in the page cache can exist when the underlying
mapping is DELALLOC, UNWRITTEN or WRITTEN. Having the mapping
change from DELALLOC->UNWRITTEN or UNWRITTEN->WRITTEN does not
change the data in the page - it only affects actions if we are
initialising a new page. Hence #3 applies and we don't care
about these extent map transitions racing with
iomap_page_mkwrite().
5. iomap_page_mkwrite() checks for page invalidation races
(truncate, hole punch, etc) after it locks the folio. We also
hold the mapping->invalidate_lock here, and hence the mapping
cannot change due to extent removal operations while we are
iterating the folio.
As such, filesystems that don't use bufferheads will never fail
the iomap_folio_mkwrite_iter() operation on the current mapping,
regardless of whether the iomap should be considered stale.
Further, the range we are asked to iterate is limited to the range
inside EOF that the folio spans. Hence, for XFS, we will only map
the exact range we are asked for, and we will only do speculative
preallocation with delalloc if we are mapping a hole at the EOF
page. The iterator will consume the entire range of the folio that
is within EOF, and anything beyond the EOF block cannot be accessed.
We never need to truncate this post-EOF speculative prealloc away in
the context of the iomap_page_mkwrite() iterator because if it
remains unused we'll remove it when the last reference to the inode
goes away.
Hence we don't actually need an .iomap_end() cleanup/error handling
path at all for iomap_page_mkwrite() for XFS. This means we can
separate the page fault processing from the complexity of the
.iomap_end() processing in the buffered write path. This also means
that the buffered write path will also be able to take the
mapping->invalidate_lock as necessary.
Signed-off-by: Dave Chinner <dchinner(a)redhat.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Darrick J. Wong <djwong(a)kernel.org>
Signed-off-by: Leah Rumancik <leah.rumancik(a)gmail.com>
Acked-by: Darrick J. Wong <djwong(a)kernel.org>
---
fs/xfs/xfs_file.c | 2 +-
fs/xfs/xfs_iomap.c | 9 +++++++++
fs/xfs/xfs_iomap.h | 1 +
3 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index e462d39c840e..595a5bcf46b9 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1325,7 +1325,7 @@ __xfs_filemap_fault(
if (write_fault) {
xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
ret = iomap_page_mkwrite(vmf,
- &xfs_buffered_write_iomap_ops);
+ &xfs_page_mkwrite_iomap_ops);
xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
} else {
ret = filemap_fault(vmf);
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 07da03976ec1..5cea069a38b4 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1187,6 +1187,15 @@ const struct iomap_ops xfs_buffered_write_iomap_ops = {
.iomap_end = xfs_buffered_write_iomap_end,
};
+/*
+ * iomap_page_mkwrite() will never fail in a way that requires delalloc extents
+ * that it allocated to be revoked. Hence we do not need an .iomap_end method
+ * for this operation.
+ */
+const struct iomap_ops xfs_page_mkwrite_iomap_ops = {
+ .iomap_begin = xfs_buffered_write_iomap_begin,
+};
+
static int
xfs_read_iomap_begin(
struct inode *inode,
diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
index c782e8c0479c..0f62ab633040 100644
--- a/fs/xfs/xfs_iomap.h
+++ b/fs/xfs/xfs_iomap.h
@@ -47,6 +47,7 @@ xfs_aligned_fsb_count(
}
extern const struct iomap_ops xfs_buffered_write_iomap_ops;
+extern const struct iomap_ops xfs_page_mkwrite_iomap_ops;
extern const struct iomap_ops xfs_direct_write_iomap_ops;
extern const struct iomap_ops xfs_read_iomap_ops;
extern const struct iomap_ops xfs_seek_iomap_ops;
--
2.45.0.rc1.225.g2a3ae87e7f-goog
On Thu, May 23, 2024 at 08:21:23AM +0000, Lin Gui (桂林) wrote:
> Dear @Greg KH<mailto:gregkh@linuxfoundation.org>,
>
>
> I don't understand, why does this qualify as a stable patch? The
>
> changes says this is "optional", which means the device should work just
>
> fine without it, right?
>
> [MTK]
>
> If without this patch, some emmc devices may cause unstable operation and report CRC errors.
>
>
>
> Is this a regression fix from something that previously used to work
>
> properly?
> [MTK]
> Yes
Ok, thanks. But you need to provide a working, and tested, version of
it for 5.15.y as it obviously does not even work there (which means you
did not test that?)
greg k-h
In the prueth_probe() function, if one of the calls to emac_phy_connect()
fails due to of_phy_connect() returning NULL, then the subsequent call to
phy_attached_info() will dereference a NULL pointer.
Check the return code of emac_phy_connect and fail cleanly if there is an
error.
Fixes: 128d5874c082 ("net: ti: icssg-prueth: Add ICSSG ethernet driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Romain Gantois <romain.gantois(a)bootlin.com>
---
Hello everyone,
There is a possible NULL pointer dereference in the prueth_probe() function of
the icssg_prueth driver. I discovered this while testing a platform with one
PRUETH MAC enabled out of the two available.
These are the requirements to reproduce the bug:
- prueth_probe() is called
- either eth0_node or eth1_node is not NULL
- in emac_phy_connect: of_phy_connect() returns NULL
Then, the following leads to the NULL pointer dereference:
- prueth->emac[PRUETH_MAC0]->ndev->phydev is set to NULL
- prueth->emac[PRUETH_MAC0]->ndev->phydev is passed to phy_attached_info()
  -> phy_attached_print() dereferences phydev which is NULL
This series provides a fix by checking the return code of emac_phy_connect().
Best Regards,
Romain
---
drivers/net/ethernet/ti/icssg/icssg_prueth.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index 7c9e9518f555a..1ea3fbd5e954e 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -1039,7 +1039,12 @@ static int prueth_probe(struct platform_device *pdev)
prueth->registered_netdevs[PRUETH_MAC0] = prueth->emac[PRUETH_MAC0]->ndev;
- emac_phy_connect(prueth->emac[PRUETH_MAC0]);
+ ret = emac_phy_connect(prueth->emac[PRUETH_MAC0]);
+ if (ret) {
+ dev_err(dev,
+ "can't connect to MII0 PHY, error %d", ret);
+ goto netdev_unregister;
+ }
phy_attached_info(prueth->emac[PRUETH_MAC0]->ndev->phydev);
}
@@ -1051,7 +1056,12 @@ static int prueth_probe(struct platform_device *pdev)
}
prueth->registered_netdevs[PRUETH_MAC1] = prueth->emac[PRUETH_MAC1]->ndev;
- emac_phy_connect(prueth->emac[PRUETH_MAC1]);
+ ret = emac_phy_connect(prueth->emac[PRUETH_MAC1]);
+ if (ret) {
+ dev_err(dev,
+ "can't connect to MII1 PHY, error %d", ret);
+ goto netdev_unregister;
+ }
phy_attached_info(prueth->emac[PRUETH_MAC1]->ndev->phydev);
}
---
base-commit: e4a87abf588536d1cdfb128595e6e680af5cf3ed
change-id: 20240521-icssg-prueth-fix-03b03064c5ce
Best regards,
--
Romain Gantois <romain.gantois(a)bootlin.com>
On Thu, May 23, 2024 at 06:41:04AM +0000, Lin Gui (桂林) wrote:
> Dear Greg KH,
>
> > What is the git id of it in Linus's tree?
>
> [MTK]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/d…
>
> author Mengqi Zhang <mengqi.zhang(a)mediatek.com> 2023-12-25 17:38:40 +0800
> committer Ulf Hansson <ulf.hansson(a)linaro.org> 2024-01-02 17:54:05 +0100
> commit 77e01b49e35f24ebd1659096d5fc5c3b75975545 (patch)
> tree 02a13063666685bc7061b46183fc45298b2dc9f4 /drivers/mmc/core/mmc.c
> parent 09f164d393a6671e5ff8342ba6b3cb7fe3f20208 (diff)
> download linux-77e01b49e35f24ebd1659096d5fc5c3b75975545.tar.gz
> mmc: core: Add HS400 tuning in HS400es initialization
> During the initialization to HS400es stage, add a HS400 tuning flow as an
> optional process. For Mediatek IP, the HS400es mode requires a specific
> tuning to ensure the correct HS400 timing setting.
>
> Signed-off-by: Mengqi Zhang <mengqi.zhang(a)mediatek.com>
> Link: https://lore.kernel.org/r/20231225093839.22931-2-mengqi.zhang@mediatek.com
> Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
I don't understand, why does this qualify as a stable patch? The
changes says this is "optional", which means the device should work just
fine without it, right?
Is this a regression fix from something that previously used to work
properly?
You have read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
right?
thanks,
greg k-h
Hi reviewers,
I suggest backporting the following commit to the Linux 5.10 and 6.6 stable trees.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/d…
author Lin Gui <lin.gui(a)mediatek.com> 2023-12-19 07:05:32 +0800
committer Ulf Hansson <ulf.hansson(a)linaro.org> 2024-01-02 17:54:05 +0100
commit e4df56ad0bf3506c5189abb9be83f3bea05a4c4f (patch)
tree a5db3a85f44b29dd773c5c65c3340d50b74b6687 /drivers/mmc/core/mmc.c
parent b062136d0d6f46d7ad5c88219cbd75f90cb18e81 (diff)
download linux-e4df56ad0bf3506c5189abb9be83f3bea05a4c4f.tar.gz
mmc: core: Add wp_grp_size sysfs node
The eMMC card can be set into write-protected mode to prevent data from
being accidentally modified or deleted. Wp_grp_size (Write Protect Group
Size) refers to an attribute of the eMMC card, used to manage write
protection and is the CSD register [36:32] of the eMMC device. Wp_grp_size
(Write Protect Group Size) indicates how many eMMC blocks are contained in
each write protection group on the eMMC card.
To allow userspace easy access of the CSD register bits, let's add sysfs
node "wp_grp_size".
Signed-off-by: Lin Gui <lin.gui(a)mediatek.com>
Signed-off-by: Bo Ye <bo.ye(a)mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno(a)collabora.com>
Link: https://lore.kernel.org/r/20231218230532.82427-1-bo.ye@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson(a)linaro.org>
------------------------------------
Best Regards !
Guilin
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
When the inode is being dropped from the dentry, the TRACEFS_EVENT_INODE
flag needs to be cleared to prevent a remount from calling
eventfs_remount() on the tracefs_inode private data. There's a race
window between when the inode is dropped (and the dentry freed) and when
the inode is actually freed. If a remount happens between the two, the
eventfs_inode could be accessed after it is freed (only the dentry keeps
a ref count on it).
Currently the TRACEFS_EVENT_INODE flag is cleared from the dentry iput()
function. But this is incorrect, as it is possible that the inode has
another reference to it. The flag should only be cleared when the inode is
really being dropped and has no more references. That happens in the
drop_inode callback of the inode, as that gets called when the last
reference of the inode is released.
Remove the tracefs_d_iput() function and move its logic to the more
appropriate tracefs_drop_inode() callback function.
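The difference between the two hook points can be sketched with a small userspace reference-count model. Everything here is hypothetical (struct fake_inode, FAKE_EVENT_INODE, and the fake_* helpers); the real VFS reference counting is more involved, so treat this as an illustration of the ordering only.

```c
#include <assert.h>
#include <stdbool.h>

#define FAKE_EVENT_INODE 0x1u /* hypothetical stand-in for TRACEFS_EVENT_INODE */

struct fake_inode {
	unsigned int refcount;
	unsigned int flags;
};

/* Old behaviour: the flag is cleared whenever one dentry lets go of the inode. */
void fake_d_iput(struct fake_inode *inode)
{
	assert(inode->refcount > 0);
	inode->flags &= ~FAKE_EVENT_INODE; /* too early if other references remain */
	inode->refcount--;
}

/* New behaviour: the flag is cleared only when the last reference goes away. */
void fake_iput(struct fake_inode *inode)
{
	assert(inode->refcount > 0);
	if (--inode->refcount == 0)
		inode->flags &= ~FAKE_EVENT_INODE; /* models the drop_inode callback */
}

/* A remount only calls into eventfs for inodes that still carry the flag. */
bool fake_remount_visits(const struct fake_inode *inode)
{
	return (inode->flags & FAKE_EVENT_INODE) != 0;
}
```

In this model, an inode with two references loses its flag after a single fake_d_iput() even though it is still alive, whereas fake_iput() keeps the flag set until the final reference is released.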
Cc: stable(a)vger.kernel.org
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/inode.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 9252e0d78ea2..7c29f4afc23d 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -426,10 +426,26 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
return 0;
}
+static int tracefs_drop_inode(struct inode *inode)
+{
+ struct tracefs_inode *ti = get_tracefs(inode);
+
+ /*
+ * This inode is being freed and cannot be used for
+ * eventfs. Clear the flag so that it doesn't call into
+ * eventfs during the remount flag updates. The eventfs_inode
+ * gets freed after an RCU cycle, so the content will still
+ * be safe if the iteration is going on now.
+ */
+ ti->flags &= ~TRACEFS_EVENT_INODE;
+
+ return 1;
+}
+
static const struct super_operations tracefs_super_operations = {
.alloc_inode = tracefs_alloc_inode,
.free_inode = tracefs_free_inode,
- .drop_inode = generic_delete_inode,
+ .drop_inode = tracefs_drop_inode,
.statfs = simple_statfs,
.show_options = tracefs_show_options,
};
@@ -455,22 +471,7 @@ static int tracefs_d_revalidate(struct dentry *dentry, unsigned int flags)
return !(ei && ei->is_freed);
}
-static void tracefs_d_iput(struct dentry *dentry, struct inode *inode)
-{
- struct tracefs_inode *ti = get_tracefs(inode);
-
- /*
- * This inode is being freed and cannot be used for
- * eventfs. Clear the flag so that it doesn't call into
- * eventfs during the remount flag updates. The eventfs_inode
- * gets freed after an RCU cycle, so the content will still
- * be safe if the iteration is going on now.
- */
- ti->flags &= ~TRACEFS_EVENT_INODE;
-}
-
static const struct dentry_operations tracefs_dentry_operations = {
- .d_iput = tracefs_d_iput,
.d_revalidate = tracefs_d_revalidate,
.d_release = tracefs_d_release,
};
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The change to update the permissions of the eventfs_inode had the
misconception that using the tracefs_inode would find all the
eventfs_inodes that have been updated and reset them on remount.
The problem with this approach is that the eventfs_inodes are freed when
they are no longer used (basically the reason the eventfs system exists).
When they are freed, the updated eventfs_inodes are not reset on a remount
because their tracefs_inodes have been freed.
Instead, since the events directory eventfs_inode always has a
tracefs_inode pointing to it (it is not freed when finished), and the
events directory has a link to all its children, have the
eventfs_remount() function only operate on the events eventfs_inode and
have it descend into its children updating their uid and gids.
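The top-down walk described above can be sketched in userspace C as follows. The node layout (first_child/next_sibling instead of a kernel list_head), the FAKE_MAX_DEPTH constant, and the returned count are assumptions made for the example; only the idea of a bounded recursive descent from the events directory comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_MAX_DEPTH 3 /* mirrors the "level > 3" guard in the patch */

/* Hypothetical, simplified node; the real eventfs_inode uses a list_head. */
struct fake_ei {
	unsigned int uid;
	unsigned int gid;
	struct fake_ei *first_child;
	struct fake_ei *next_sibling;
};

/* Walk from the events directory downwards; return how many nodes were updated. */
int fake_set_attrs(struct fake_ei *ei, unsigned int uid, unsigned int gid, int level)
{
	int updated = 1;

	assert(ei != NULL);
	if (level > FAKE_MAX_DEPTH)
		return 0; /* refuse to recurse deeper than expected */

	ei->uid = uid;
	ei->gid = gid;

	for (struct fake_ei *c = ei->first_child; c; c = c->next_sibling)
		updated += fake_set_attrs(c, uid, gid, level + 1);

	return updated;
}
```

Because the walk starts at the one node that is always present, children whose own inodes have already been freed are still reached through the parent's child links.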
Link: https://lore.kernel.org/all/CAK7LNARXgaWw3kH9JgrnH4vK6fr8LDkNKf3wq8NhMWJrVw…
Cc: stable(a)vger.kernel.org
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Reported-by: Masahiro Yamada <masahiroy(a)kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/event_inode.c | 44 ++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 13 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 5dfb1ccd56ea..129d0f54ba62 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -305,27 +305,27 @@ static const struct file_operations eventfs_file_operations = {
.llseek = generic_file_llseek,
};
-/*
- * On a remount of tracefs, if UID or GID options are set, then
- * the mount point inode permissions should be used.
- * Reset the saved permission flags appropriately.
- */
-void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+static void eventfs_set_attrs(struct eventfs_inode *ei, bool update_uid, kuid_t uid,
+ bool update_gid, kgid_t gid, int level)
{
- struct eventfs_inode *ei = ti->private;
+ struct eventfs_inode *ei_child;
- if (!ei)
+ /* Update events/<system>/<event> */
+ if (WARN_ON_ONCE(level > 3))
return;
if (update_uid) {
ei->attr.mode &= ~EVENTFS_SAVE_UID;
- ei->attr.uid = ti->vfs_inode.i_uid;
+ ei->attr.uid = uid;
}
-
if (update_gid) {
ei->attr.mode &= ~EVENTFS_SAVE_GID;
- ei->attr.gid = ti->vfs_inode.i_gid;
+ ei->attr.gid = gid;
+ }
+
+ list_for_each_entry(ei_child, &ei->children, list) {
+ eventfs_set_attrs(ei_child, update_uid, uid, update_gid, gid, level + 1);
}
if (!ei->entry_attrs)
@@ -334,13 +334,31 @@ void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
for (int i = 0; i < ei->nr_entries; i++) {
if (update_uid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_UID;
- ei->entry_attrs[i].uid = ti->vfs_inode.i_uid;
+ ei->entry_attrs[i].uid = uid;
}
if (update_gid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_GID;
- ei->entry_attrs[i].gid = ti->vfs_inode.i_gid;
+ ei->entry_attrs[i].gid = gid;
}
}
+
+}
+
+/*
+ * On a remount of tracefs, if UID or GID options are set, then
+ * the mount point inode permissions should be used.
+ * Reset the saved permission flags appropriately.
+ */
+void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+{
+ struct eventfs_inode *ei = ti->private;
+
+ /* Only the events directory does the updates */
+ if (!ei || !ei->is_events || ei->is_freed)
+ return;
+
+ eventfs_set_attrs(ei, update_uid, ti->vfs_inode.i_uid,
+ update_gid, ti->vfs_inode.i_gid, 0);
}
/* Return the evenfs_inode of the "events" directory */
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
When a remount happens, if a gid or uid is specified update the inodes to
have the same gid and uid. This will allow the simplification of the
permissions logic for the dynamically created files and directories.
Cc: stable(a)vger.kernel.org
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/event_inode.c | 17 +++++++++++++----
fs/tracefs/inode.c | 15 ++++++++++++---
2 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 55a40a730b10..5dfb1ccd56ea 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -317,20 +317,29 @@ void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
if (!ei)
return;
- if (update_uid)
+ if (update_uid) {
ei->attr.mode &= ~EVENTFS_SAVE_UID;
+ ei->attr.uid = ti->vfs_inode.i_uid;
+ }
+
- if (update_gid)
+ if (update_gid) {
ei->attr.mode &= ~EVENTFS_SAVE_GID;
+ ei->attr.gid = ti->vfs_inode.i_gid;
+ }
if (!ei->entry_attrs)
return;
for (int i = 0; i < ei->nr_entries; i++) {
- if (update_uid)
+ if (update_uid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_UID;
- if (update_gid)
+ ei->entry_attrs[i].uid = ti->vfs_inode.i_uid;
+ }
+ if (update_gid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_GID;
+ ei->entry_attrs[i].gid = ti->vfs_inode.i_gid;
+ }
}
}
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index a827f6a716c4..9252e0d78ea2 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -373,12 +373,21 @@ static int tracefs_apply_options(struct super_block *sb, bool remount)
rcu_read_lock();
list_for_each_entry_rcu(ti, &tracefs_inodes, list) {
- if (update_uid)
+ if (update_uid) {
ti->flags &= ~TRACEFS_UID_PERM_SET;
+ ti->vfs_inode.i_uid = fsi->uid;
+ }
- if (update_gid)
+ if (update_gid) {
ti->flags &= ~TRACEFS_GID_PERM_SET;
-
+ ti->vfs_inode.i_gid = fsi->gid;
+ }
+
+ /*
+ * Note, the above ti->vfs_inode updates are
+ * used in eventfs_remount() so they must come
+ * before calling it.
+ */
if (ti->flags & TRACEFS_EVENT_INODE)
eventfs_remount(ti, update_uid, update_gid);
}
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
The directories require unique inode numbers but all the eventfs files
have the same inode number. Prevent the directories from having the same
inode numbers as the files as that can confuse some tooling.
Cc: stable(a)vger.kernel.org
Fixes: 834bf76add3e6 ("eventfs: Save directory inodes in the eventfs_inode structure")
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
fs/tracefs/event_inode.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 0256afdd4acf..55a40a730b10 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -50,8 +50,12 @@ static struct eventfs_root_inode *get_root_inode(struct eventfs_inode *ei)
/* Just try to make something consistent and unique */
static int eventfs_dir_ino(struct eventfs_inode *ei)
{
- if (!ei->ino)
+ if (!ei->ino) {
ei->ino = get_next_ino();
+ /* Must not have the file inode number */
+ if (ei->ino == EVENTFS_FILE_INODE_INO)
+ ei->ino = get_next_ino();
+ }
return ei->ino;
}
--
2.43.0
[CCing Mario, who asked for the two suspected commits to be backported]
On 06.05.24 14:24, Gia wrote:
> Hello, from 6.8.7=>6.8.8 I run into a similar problem with my Caldigit
> TS3 Plus Thunderbolt 3 dock.
>
> After the update I see this message on boot "xHCI host controller not
> responding, assume dead" and the dock is not working anymore. Kernel
> 6.8.7 works great.
Thx for the report. Could you make the kernel log (journalctl -k/dmesg)
accessible somewhere?
And have you looked into the other stuff that Mario suggested in the
other thread? See the following mail and the reply to it for details:
https://lore.kernel.org/all/1eb96465-0a81-4187-b8e7-607d85617d5f@gmail.com/…
Ciao, Thorsten
P.S.: To be sure the issue doesn't fall through the cracks unnoticed,
I'm adding it to regzbot, the Linux kernel regression tracking bot:
#regzbot ^introduced v6.8.7..v6.8.8
#regzbot title thunderbolt: TB3 dock problems, xHCI host controller not
responding, assume dead
From: Petr Pavlu <petr.pavlu(a)suse.com>
The reader code in rb_get_reader_page() swaps a new reader page into the
ring buffer by doing cmpxchg on old->list.prev->next to point it to the
new page. Following that, if the operation is successful,
old->list.next->prev gets updated too. This means the underlying
doubly-linked list is temporarily inconsistent: page->prev->next or
page->next->prev might not point back to page for some page in the
ring buffer.
The resize operation in ring_buffer_resize() can be invoked in parallel.
It calls rb_check_pages() which can detect the described inconsistency
and stop further tracing:
[ 190.271762] ------------[ cut here ]------------
[ 190.271771] WARNING: CPU: 1 PID: 6186 at kernel/trace/ring_buffer.c:1467 rb_check_pages.isra.0+0x6a/0xa0
[ 190.271789] Modules linked in: [...]
[ 190.271991] Unloaded tainted modules: intel_uncore_frequency(E):1 skx_edac(E):1
[ 190.272002] CPU: 1 PID: 6186 Comm: cmd.sh Kdump: loaded Tainted: G E 6.9.0-rc6-default #5 158d3e1e6d0b091c34c3b96bfd99a1c58306d79f
[ 190.272011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[ 190.272015] RIP: 0010:rb_check_pages.isra.0+0x6a/0xa0
[ 190.272023] Code: [...]
[ 190.272028] RSP: 0018:ffff9c37463abb70 EFLAGS: 00010206
[ 190.272034] RAX: ffff8eba04b6cb80 RBX: 0000000000000007 RCX: ffff8eba01f13d80
[ 190.272038] RDX: ffff8eba01f130c0 RSI: ffff8eba04b6cd00 RDI: ffff8eba0004c700
[ 190.272042] RBP: ffff8eba0004c700 R08: 0000000000010002 R09: 0000000000000000
[ 190.272045] R10: 00000000ffff7f52 R11: ffff8eba7f600000 R12: ffff8eba0004c720
[ 190.272049] R13: ffff8eba00223a00 R14: 0000000000000008 R15: ffff8eba067a8000
[ 190.272053] FS: 00007f1bd64752c0(0000) GS:ffff8eba7f680000(0000) knlGS:0000000000000000
[ 190.272057] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 190.272061] CR2: 00007f1bd6662590 CR3: 000000010291e001 CR4: 0000000000370ef0
[ 190.272070] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 190.272073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 190.272077] Call Trace:
[ 190.272098] <TASK>
[ 190.272189] ring_buffer_resize+0x2ab/0x460
[ 190.272199] __tracing_resize_ring_buffer.part.0+0x23/0xa0
[ 190.272206] tracing_resize_ring_buffer+0x65/0x90
[ 190.272216] tracing_entries_write+0x74/0xc0
[ 190.272225] vfs_write+0xf5/0x420
[ 190.272248] ksys_write+0x67/0xe0
[ 190.272256] do_syscall_64+0x82/0x170
[ 190.272363] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 190.272373] RIP: 0033:0x7f1bd657d263
[ 190.272381] Code: [...]
[ 190.272385] RSP: 002b:00007ffe72b643f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 190.272391] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1bd657d263
[ 190.272395] RDX: 0000000000000002 RSI: 0000555a6eb538e0 RDI: 0000000000000001
[ 190.272398] RBP: 0000555a6eb538e0 R08: 000000000000000a R09: 0000000000000000
[ 190.272401] R10: 0000555a6eb55190 R11: 0000000000000246 R12: 00007f1bd6662500
[ 190.272404] R13: 0000000000000002 R14: 00007f1bd6667c00 R15: 0000000000000002
[ 190.272412] </TASK>
[ 190.272414] ---[ end trace 0000000000000000 ]---
Note that ring_buffer_resize() calls rb_check_pages() only if the parent
trace_buffer has recording disabled. Recent commit d78ab792705c
("tracing: Stop current tracer when resizing buffer") makes this always
the case, which makes it more likely to hit this issue.
The window to hit this race is nonetheless very small. To help
reproduce it, one can add a delay loop in rb_get_reader_page():
ret = rb_head_page_replace(reader, cpu_buffer->reader_page);
if (!ret)
goto spin;
for (unsigned i = 0; i < 1U << 26; i++) /* inserted delay loop */
__asm__ __volatile__ ("" : : : "memory");
rb_list_head(reader->list.next)->prev = &cpu_buffer->reader_page->list;
.. and then run the following commands on the target system:
echo 1 > /sys/kernel/tracing/events/sched/sched_switch/enable
while true; do
echo 16 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
echo 8 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
done &
while true; do
for i in /sys/kernel/tracing/per_cpu/*; do
timeout 0.1 cat $i/trace_pipe; sleep 0.2
done
done
To fix the problem, make sure ring_buffer_resize() doesn't invoke
rb_check_pages() concurrently with a reader operating on the same
ring_buffer_per_cpu by taking its cpu_buffer->reader_lock.
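The transient inconsistency itself can be shown without any threads. The sketch below is a simplified userspace model, assuming a plain circular doubly-linked list with no flag bits in the pointers; struct fake_page and the fake_* functions are hypothetical names. It splits the page swap into its two pointer updates so that a check can be run in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_page {
	struct fake_page *next;
	struct fake_page *prev;
};

/* Link n pages (n >= 1) into a circular doubly-linked list. */
void fake_ring_init(struct fake_page *pages, size_t n)
{
	assert(n >= 1);
	for (size_t i = 0; i < n; i++) {
		pages[i].next = &pages[(i + 1) % n];
		pages[i].prev = &pages[(i + n - 1) % n];
	}
}

/* Models the integrity check: both neighbours must point back at each page. */
bool fake_ring_consistent(const struct fake_page *head)
{
	const struct fake_page *p = head;

	do {
		if (p->next->prev != p || p->prev->next != p)
			return false;
		p = p->next;
	} while (p != head);
	return true;
}

/* Step 1 of the swap: the predecessor's next pointer now targets the new page. */
void fake_swap_step1(struct fake_page *old, struct fake_page *new_page)
{
	new_page->next = old->next;
	new_page->prev = old->prev;
	old->prev->next = new_page;
}

/* Step 2 of the swap: the successor's prev pointer catches up. */
void fake_swap_step2(struct fake_page *old, struct fake_page *new_page)
{
	old->next->prev = new_page;
}
```

A check that runs between the two steps sees a list whose forward and backward links disagree. Holding one lock around both the swap and the check, as the patch does with cpu_buffer->reader_lock, prevents the check from observing that intermediate state.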
Link: https://lore.kernel.org/linux-trace-kernel/20240517134008.24529-3-petr.pavl…
Cc: stable(a)vger.kernel.org
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Fixes: 659f451ff213 ("ring-buffer: Add integrity check at end of iter read")
Signed-off-by: Petr Pavlu <petr.pavlu(a)suse.com>
[ Fixed whitespace ]
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 42227727a49d..28853966aa9a 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1460,6 +1460,11 @@ static void rb_check_bpage(struct ring_buffer_per_cpu *cpu_buffer,
*
* As a safety measure we check to make sure the data pages have not
* been corrupted.
+ *
+ * Callers of this function need to guarantee that the list of pages doesn't get
+ * modified during the check. In particular, if it's possible that the function
+ * is invoked with concurrent readers which can swap in a new reader page then
+ * the caller should take cpu_buffer->reader_lock.
*/
static void rb_check_pages(struct ring_buffer_per_cpu *cpu_buffer)
{
@@ -2210,8 +2215,12 @@ int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
*/
synchronize_rcu();
for_each_buffer_cpu(buffer, cpu) {
+ unsigned long flags;
+
cpu_buffer = buffer->buffers[cpu];
+ raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
rb_check_pages(cpu_buffer);
+ raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
}
atomic_dec(&buffer->record_disabled);
}
--
2.43.0
Hello,
Upstream commit 11fbb1bfb5bc8c98b2d7db9da332b5e568f4aaab ("ice: use
relative VSI index for VFs VSIs") was applied to stable 6.1, 6.6 and 6.8:
6.1: 5693dd6d3d01f0eea24401f815c98b64cb315b67
6.6: c926393dc3442c38fdcab17d040837cf4acad1c3
6.8: d3da0d4d9fb472ad7dccb784f3d9de40d0c2f6a9
However, it was part of a series submitted to net-next [1]. Applying
this one patch on its own broke VF devices created with ice as the PF driver:
# [ 307.688237] iavf: Intel(R) Ethernet Adaptive Virtual Function Network Driver
# [ 307.688241] Copyright (c) 2013 - 2018 Intel Corporation.
# [ 307.688424] iavf 0000:af:01.0: enabling device (0000 -> 0002)
# [ 307.758860] iavf 0000:af:01.0: Invalid MAC address 00:00:00:00:00:00, using random
# [ 307.759628] iavf 0000:af:01.0: Multiqueue Enabled: Queue pair count = 16
# [ 307.759683] iavf 0000:af:01.0: MAC address: 6a:46:83:88:c2:26
# [ 307.759688] iavf 0000:af:01.0: GRO is enabled
# [ 307.790937] iavf 0000:af:01.0 ens802f0v0: renamed from eth0
# [ 307.896041] iavf 0000:af:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
# [ 307.916967] iavf 0000:af:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 8
The VF initialization fails and the VF device is completely unusable.
This can be fixed either by:
1 - Reverting the above mentioned commit (upstream
11fbb1bfb5bc8c98b2d7db9da332b5e568f4aaab)
Or,
2 - applying the following upstream commits (part of the series):
a) a21605993dd5dfd15edfa7f06705ede17b519026 ("ice: pass VSI pointer
into ice_vc_isvalid_q_id")
b) 363f689600dd010703ce6391bcfc729a97d21840 ("ice: remove unnecessary
duplicate checks for VF VSI ID")
Thanks,
Ahmed
[1]: https://www.spinics.net/lists/netdev/msg979289.html
From: Baokun Li <libaokun1(a)huawei.com>
commit d36f6ed761b53933b0b4126486c10d3da7751e7f upstream.
Hulk Robot reported a BUG_ON:
==================================================================
kernel BUG at fs/ext4/extents_status.c:199!
[...]
RIP: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline]
RIP: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217
[...]
Call Trace:
ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766
ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561
ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964
ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384
ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567
ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980
ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031
ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257
v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63
v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82
vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368
dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490
ext4_quota_enable fs/ext4/super.c:6137 [inline]
ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163
ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754
mount_bdev+0x2e9/0x3b0 fs/super.c:1158
mount_fs+0x4b/0x1e4 fs/super.c:1261
[...]
==================================================================
Above issue may happen as follows:
-------------------------------------
ext4_fill_super
ext4_enable_quotas
ext4_quota_enable
ext4_iget
__ext4_iget
ext4_ext_check_inode
ext4_ext_check
__ext4_ext_check
ext4_valid_extent_entries
Check for overlapping extents doesn't take effect
dquot_enable
vfs_load_quota_inode
v2_check_quota_file
v2_read_header
ext4_quota_read
ext4_bread
ext4_getblk
ext4_map_blocks
ext4_ext_map_blocks
ext4_find_extent
ext4_cache_extents
ext4_es_cache_extent
ext4_es_cache_extent
__es_tree_search
ext4_es_end
BUG_ON(es->es_lblk + es->es_len < es->es_lblk)
The erroneous ext4 extents are as follows:
0af3 0300 0400 0000 00000000 extent_header
00000000 0100 0000 12000000 extent1
00000000 0100 0000 18000000 extent2
02000000 0400 0000 14000000 extent3
In the ext4_valid_extent_entries() function,
if prev is 0, no error is returned even if lblock <= prev.
This was intended to skip the check on the first extent, but
in the error image above, prev = 0 + 1 - 1 = 0 when checking the second extent,
so even though lblock <= prev, the function does not return an error.
As a result, the BUG_ON triggers in __es_tree_search and the system panics.
To solve this problem, we only need to check that:
1. The lblock of the first extent is not less than 0.
2. The lblock of the next extent is not less than
the next block of the previous extent.
The same applies to extent_idx.
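As a worked example, the two variants of the check can be run against the extents from the dump above (logical block 0 length 1, logical block 0 length 1, logical block 2 length 4). This is a userspace sketch: struct fake_extent and the fake_valid_* functions are hypothetical simplifications that keep only the overlap test and leave out the physical-block validation done by the real function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified extent: first logical block and length in blocks. */
struct fake_extent {
	uint32_t lblock;
	uint32_t len;
};

/* Old check: prev doubles as a "first extent" marker, so prev == 0 disables it. */
bool fake_valid_old(const struct fake_extent *ext, size_t n)
{
	uint32_t prev = 0;

	assert(ext != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ext[i].lblock <= prev && prev)
			return false;
		prev = ext[i].lblock + ext[i].len - 1; /* last block covered */
	}
	return true;
}

/* New check: cur is the first block not yet covered by any earlier extent. */
bool fake_valid_new(const struct fake_extent *ext, size_t n)
{
	uint32_t cur = 0;

	assert(ext != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ext[i].lblock < cur)
			return false;
		cur = ext[i].lblock + ext[i].len; /* next free block */
	}
	return true;
}
```

For the corrupted image, fake_valid_old() accepts the list because prev is still 0 when the second extent is examined, while fake_valid_new() rejects it because the second extent starts at block 0 and cur is already 1.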
Cc: stable(a)kernel.org
Fixes: 5946d089379a ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
Reported-by: Hulk Robot <hulkci(a)huawei.com>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso(a)mit.edu>
Reported-by: syzbot+2a58d88f0fb315c85363(a)syzkaller.appspotmail.com
[gpiccoli: Manual backport due to unrelated missing patches.]
Signed-off-by: Guilherme G. Piccoli <gpiccoli(a)igalia.com>
---
Hey folks, this one should have been backported, but due to merge
issues [0] it ended up not landing in 5.4.y. So here is a working version!
Cheers,
Guilherme
[0] https://lore.kernel.org/stable/165451751147179@kroah.com/
fs/ext4/extents.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 98e1b1ddb4ec..90b12c7c0f20 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -409,7 +409,7 @@ static int ext4_valid_extent_entries(struct inode *inode,
{
unsigned short entries;
ext4_lblk_t lblock = 0;
- ext4_lblk_t prev = 0;
+ ext4_lblk_t cur = 0;
if (eh->eh_entries == 0)
return 1;
@@ -435,12 +435,12 @@ static int ext4_valid_extent_entries(struct inode *inode,
/* Check for overlapping extents */
lblock = le32_to_cpu(ext->ee_block);
- if ((lblock <= prev) && prev) {
+ if (lblock < cur) {
pblock = ext4_ext_pblock(ext);
es->s_last_error_block = cpu_to_le64(pblock);
return 0;
}
- prev = lblock + ext4_ext_get_actual_len(ext) - 1;
+ cur = lblock + ext4_ext_get_actual_len(ext);
ext++;
entries--;
}
@@ -460,13 +460,13 @@ static int ext4_valid_extent_entries(struct inode *inode,
/* Check for overlapping index extents */
lblock = le32_to_cpu(ext_idx->ei_block);
- if ((lblock <= prev) && prev) {
+ if (lblock < cur) {
*pblk = ext4_idx_pblock(ext_idx);
return 0;
}
ext_idx++;
entries--;
- prev = lblock;
+ cur = lblock + 1;
}
}
return 1;
--
2.43.2
From: NeilBrown <neilb(a)suse.de>
[ Upstream commit 3903902401451b1cd9d797a8c79769eb26ac7fe5 ]
The original implementation of nfsd used signals to stop threads during
shutdown.
In Linux 2.3.46pre5 nfsd gained the ability to shut down threads
internally if it was asked to run "0" threads. After this, user-space
transitioned to using "rpc.nfsd 0" to stop nfsd, and sending signals to
threads was no longer an important part of the API.
In commit 3ebdbe5203a8 ("SUNRPC: discard svo_setup and rename
svc_set_num_threads_sync()") (v5.17-rc1~75^2~41) we finally removed the
use of signals for stopping threads, using kthread_stop() instead.
This patch makes the "obvious" next step and removes the ability to
signal nfsd threads - or any svc threads. nfsd stops allowing signals
and we don't check for their delivery any more.
This will allow for some simplification in later patches.
A change worth noting is in nfsd4_ssc_setup_dul(). There was previously
a signal_pending() check which would only succeed when the thread was
being shut down. It should really have tested kthread_should_stop() as
well. Now it just does the latter, not the former.
Signed-off-by: NeilBrown <neilb(a)suse.de>
Reviewed-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
---
fs/nfs/callback.c | 9 +--------
fs/nfsd/nfs4proc.c | 5 ++---
fs/nfsd/nfssvc.c | 12 ------------
net/sunrpc/svc_xprt.c | 16 ++++++----------
4 files changed, 9 insertions(+), 33 deletions(-)
diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 456af7d230cf..46a0a2d6962e 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -80,9 +80,6 @@ nfs4_callback_svc(void *vrqstp)
set_freezable();
while (!kthread_freezable_should_stop(NULL)) {
-
- if (signal_pending(current))
- flush_signals(current);
/*
* Listen for a request on the socket
*/
@@ -112,11 +109,7 @@ nfs41_callback_svc(void *vrqstp)
set_freezable();
while (!kthread_freezable_should_stop(NULL)) {
-
- if (signal_pending(current))
- flush_signals(current);
-
- prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_INTERRUPTIBLE);
+ prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_IDLE);
spin_lock_bh(&serv->sv_cb_lock);
if (!list_empty(&serv->sv_cb_list)) {
req = list_first_entry(&serv->sv_cb_list,
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index c14f5ac1484c..6779291efca9 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1317,12 +1317,11 @@ static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
/* found a match */
if (ni->nsui_busy) {
/* wait - and try again */
- prepare_to_wait(&nn->nfsd_ssc_waitq, &wait,
- TASK_INTERRUPTIBLE);
+ prepare_to_wait(&nn->nfsd_ssc_waitq, &wait, TASK_IDLE);
spin_unlock(&nn->nfsd_ssc_lock);
/* allow 20secs for mount/unmount for now - revisit */
- if (signal_pending(current) ||
+ if (kthread_should_stop() ||
(schedule_timeout(20*HZ) == 0)) {
finish_wait(&nn->nfsd_ssc_waitq, &wait);
kfree(work);
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 4c1a0a1623e5..3d4fd40c987b 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -938,15 +938,6 @@ nfsd(void *vrqstp)
current->fs->umask = 0;
- /*
- * thread is spawned with all signals set to SIG_IGN, re-enable
- * the ones that will bring down the thread
- */
- allow_signal(SIGKILL);
- allow_signal(SIGHUP);
- allow_signal(SIGINT);
- allow_signal(SIGQUIT);
-
atomic_inc(&nfsdstats.th_cnt);
set_freezable();
@@ -971,9 +962,6 @@ nfsd(void *vrqstp)
validate_process_creds();
}
- /* Clear signals before calling svc_exit_thread() */
- flush_signals(current);
-
atomic_dec(&nfsdstats.th_cnt);
out:
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 67ccf1a6459a..b19592673eef 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -700,8 +700,8 @@ static int svc_alloc_arg(struct svc_rqst *rqstp)
/* Made progress, don't sleep yet */
continue;
- set_current_state(TASK_INTERRUPTIBLE);
- if (signalled() || kthread_should_stop()) {
+ set_current_state(TASK_IDLE);
+ if (kthread_should_stop()) {
set_current_state(TASK_RUNNING);
return -EINTR;
}
@@ -736,7 +736,7 @@ rqst_should_sleep(struct svc_rqst *rqstp)
return false;
/* are we shutting down? */
- if (signalled() || kthread_should_stop())
+ if (kthread_should_stop())
return false;
/* are we freezing? */
@@ -758,11 +758,7 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
if (rqstp->rq_xprt)
goto out_found;
- /*
- * We have to be able to interrupt this wait
- * to bring down the daemons ...
- */
- set_current_state(TASK_INTERRUPTIBLE);
+ set_current_state(TASK_IDLE);
smp_mb__before_atomic();
clear_bit(SP_CONGESTED, &pool->sp_flags);
clear_bit(RQ_BUSY, &rqstp->rq_flags);
@@ -784,7 +780,7 @@ static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp, long timeout)
if (!time_left)
atomic_long_inc(&pool->sp_stats.threads_timedout);
- if (signalled() || kthread_should_stop())
+ if (kthread_should_stop())
return ERR_PTR(-EINTR);
return ERR_PTR(-EAGAIN);
out_found:
@@ -882,7 +878,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)
try_to_freeze();
cond_resched();
err = -EINTR;
- if (signalled() || kthread_should_stop())
+ if (kthread_should_stop())
goto out;
xprt = svc_get_next_xprt(rqstp, timeout);
base-commit: b925f60c6ee7ec871d2d48575d0fde3872129c20
--
2.44.0
Hello,
Please pick up commit c79e387389d5add7cb967d2f7622c3bf5550927b ("mfd: stpmic1: Fix swapped mask/unmask in irq chip")
for inclusion in stable kernel 6.1.y.
This fixes this warning at boot:
stpmic1 [...]: mask_base and unmask_base are inverted, please fix it
It also avoids inverting the masks later in the IRQ framework, so the regression risk should be minimal.
Thanks!
--
Yoann Congal
Smile ECS - Tech Expert
Hi,
Please backport commit:
ecfe9a015d3e ("pinctrl: core: handle radix_tree_insert() errors in pinctrl_register_one_pin()")
to stable trees 5.4.y, 5.10.y, 5.15.y, 6.1.y. This commit fixes error handling of radix_tree_insert().
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Thanks,
Hagar Hemdan
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d4932
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052216-jaws-pester-a65d@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all three of the following conditions are true, transmission
stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no SKBs are queued in the driver (txq) waiting to be written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because the TX queue is re-enabled when the TX interrupt status
is handled, but if the TX status bit has already been cleared, that
interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
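The race described above can be sketched in user space as follows. This is a simplified model, not the driver: demo_dev, the demo_* helpers and the DEMO_IRQ_TXI bit value are all hypothetical, and the model assumes a status register whose bits are cleared by writing 1 to them, as the driver's write-back to KS_ISR suggests. A TX completion is injected while the handler is still processing the previous one.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_IRQ_TXI 0x4000u	/* hypothetical bit value for this sketch */

/* Hypothetical device model with a write-1-to-clear status register. */
struct demo_dev {
	uint16_t isr;
	int tx_wakeups;
};

static uint16_t demo_rdisr(struct demo_dev *d)
{
	return d->isr;
}

/* Writing a 1 clears the corresponding status bit. */
static void demo_wrisr(struct demo_dev *d, uint16_t v)
{
	d->isr &= (uint16_t)~v;
}

/* The modelled hardware signals another TX completion. */
static void demo_hw_tx_done(struct demo_dev *d)
{
	d->isr |= DEMO_IRQ_TXI;
}

/* TX handling; optionally a new completion arrives in the window. */
static void demo_handle_tx(struct demo_dev *d, int race)
{
	d->tx_wakeups++;
	if (race)
		demo_hw_tx_done(d);
}

/* Old order: handle first, clear the handled bits afterwards. */
static void demo_irq_ack_last(struct demo_dev *d, int race)
{
	uint16_t status = demo_rdisr(d);
	uint16_t handled = 0;

	if (status & DEMO_IRQ_TXI) {
		demo_handle_tx(d, race);
		handled |= DEMO_IRQ_TXI;
	}
	demo_wrisr(d, handled);
}

/* New order: clear everything that was read, then handle it. */
static void demo_irq_ack_first(struct demo_dev *d, int race)
{
	uint16_t status = demo_rdisr(d);

	demo_wrisr(d, status);
	if (status & DEMO_IRQ_TXI)
		demo_handle_tx(d, race);
}

/* Returns 1 if a TX interrupt is still pending after the handler ran. */
static int demo_txi_pending_after(void (*handler)(struct demo_dev *, int))
{
	struct demo_dev d = { .isr = DEMO_IRQ_TXI, .tx_wakeups = 0 };

	handler(&d, 1);
	return (d.isr & DEMO_IRQ_TXI) != 0;
}
```

In this model, demo_txi_pending_after(demo_irq_ack_last) reports no pending interrupt, because the late clear also wipes the completion that arrived during handling. demo_txi_pending_after(demo_irq_ack_first) reports that the bit is still set, so the handler would run again.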
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 317a215d4932
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024052215-epidemic-outpour-6c88@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 317a215d493230da361028ea8a4675de334bfa1a Mon Sep 17 00:00:00 2001
From: Ronald Wahl <ronald.wahl(a)raritan.com>
Date: Mon, 13 May 2024 16:39:22 +0200
Subject: [PATCH] net: ks8851: Fix another TX stall caused by wrong ISR flag
handling
Under some circumstances it may happen that the ks8851 Ethernet driver
stops sending data.
Currently the interrupt handler resets the interrupt status flags in the
hardware after handling TX. With this approach we may lose interrupts in
the time window between handling the TX interrupt and resetting the TX
interrupt status bit.
When all three of the following conditions are true, transmission
stops:
- TX queue is stopped to wait for room in the hardware TX buffer
- no SKBs are queued in the driver (txq) waiting to be written to hw
- hardware TX buffer is empty and the last TX interrupt was lost
This is because the TX queue is re-enabled when the TX interrupt status
is handled, but if the TX status bit has already been cleared, that
interrupt will never come.
With this commit the interrupt status flags will be cleared before they
are handled. That way we stop losing interrupts.
The wrong handling of the ISR flags was there from the beginning but
with commit 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX
buffer overrun") the issue becomes apparent.
Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun")
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jakub Kicinski <kuba(a)kernel.org>
Cc: Paolo Abeni <pabeni(a)redhat.com>
Cc: Simon Horman <horms(a)kernel.org>
Cc: netdev(a)vger.kernel.org
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl(a)raritan.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
index 502518cdb461..6453c92f0fa7 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -328,7 +328,6 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
{
struct ks8851_net *ks = _ks;
struct sk_buff_head rxq;
- unsigned handled = 0;
unsigned long flags;
unsigned int status;
struct sk_buff *skb;
@@ -336,24 +335,17 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
ks8851_lock(ks, &flags);
status = ks8851_rdreg16(ks, KS_ISR);
+ ks8851_wrreg16(ks, KS_ISR, status);
netif_dbg(ks, intr, ks->netdev,
"%s: status 0x%04x\n", __func__, status);
- if (status & IRQ_LCI)
- handled |= IRQ_LCI;
-
if (status & IRQ_LDI) {
u16 pmecr = ks8851_rdreg16(ks, KS_PMECR);
pmecr &= ~PMECR_WKEVT_MASK;
ks8851_wrreg16(ks, KS_PMECR, pmecr | PMECR_WKEVT_LINK);
-
- handled |= IRQ_LDI;
}
- if (status & IRQ_RXPSI)
- handled |= IRQ_RXPSI;
-
if (status & IRQ_TXI) {
unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
@@ -365,20 +357,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
if (netif_queue_stopped(ks->netdev))
netif_wake_queue(ks->netdev);
spin_unlock(&ks->statelock);
-
- handled |= IRQ_TXI;
}
- if (status & IRQ_RXI)
- handled |= IRQ_RXI;
-
if (status & IRQ_SPIBEI) {
netdev_err(ks->netdev, "%s: spi bus error\n", __func__);
- handled |= IRQ_SPIBEI;
}
- ks8851_wrreg16(ks, KS_ISR, handled);
-
if (status & IRQ_RXI) {
/* the datasheet says to disable the rx interrupt during
* packet read-out, however we're masking the interrupt