On Tue, Sep 05, 2023 at 05:50:06PM +0200, Cláudio Sampaio wrote:
> Hi Thorsten and Keith,
>
> Thanks for the details. I'm still unsure if responding by email is better
> or adding to the ticket, but here it goes: I have tried for days both with
> complete power off of the machine and cycle-booting all kernels in
> succession and without exception, 6.1.x LTS and the patched 6.5.1 kernel
> always recognize and operate the NVME, whilst the other kernels also fail
> with the same error message. As this is my "production" desktop, though,
> during the week it's more difficult to me to perform tests with it, but I
> will try to do it in a more methodic way and also with 6.5.1 vanilla.
>
> As for the reason the Lexar doesn't catch the quirk default, I can't say I
> catch the complex logic of the driver activation, but I found out how to
> "fix" for my case because there are three other Lexar models in the pci.c
> file: NM610, NM620 and NM760 (this one with an additional quirk marked on
> it on the code, NVME_QUIRK_IGNORE_DEV_SUBNQN) -- so I guess whatever
> justifies the exception for them also justifies for my model, NM790. Might
> even be the case that I would need NVME_QUIRK_IGNORE_DEV_SUBNQN (not sure
> what it does) like in the NM760 case, but it activates correctly without it.
The existing Lexar quirks for the identifier existed before the default
kernel behavior changed with respect to how identifiers are considered.
But the report says the device failed to enumerate with a "device not
ready" error message. That error message happens *before* identifiers
are checked, so the quirk should be a no-op with respect to that error
message. And the driver abandons the device after printing that message,
so no futher action should be taken no matter what quirk you've set.
In order for this quirk to have any effect at all, the error you should
have seen should look like a "duplicate IDs" message.
While reading from the tracing/trace, the ftrace reader rarely encounters
a KASAN invalid access issue.
It is likely that the writer has disrupted the ring_buffer that the reader
is currently parsing. the kasan report is as below:
[name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
[name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
[name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
[name:report&]
CPU: 2 PID: 380 Comm: xxxx
Call trace:
dump_backtrace+0x168/0x1b0
show_stack+0x2c/0x3c
dump_stack_lvl+0xa4/0xd4
print_report+0x268/0x9b0
kasan_report+0xdc/0x148
kasan_tag_mismatch+0x28/0x3c
__hwasan_tag_mismatch+0x2c/0x58
rb_event_length() [inline]
rb_iter_head_event+0x27c/0x3d0
ring_buffer_iter_peek+0x23c/0x6e0
__find_next_entry+0x1ac/0x3d8
s_next+0x1f0/0x310
seq_read_iter+0x4e8/0x77c
seq_read+0xf8/0x150
vfs_read+0x1a8/0x4cc
In some edge cases, ftrace reader could access to an invalid address,
specifically when reading 4 bytes beyond the end of the currently page.
While issue happened, the dump of rb_iter_head_event is shown as below:
in function rb_iter_head_event:
- iter->head = 0xFEC
- iter->next_event = 0xFEC
- commit = 0xFF0
- read_stamp = 0x2955AC46DB0
- page_stamp = 0x2955AC2439A
- iter->head_page->page = 0x71FFFF8111A17000
- iter->head_page->time_stamp = 0x2956A142267
- iter->head_page->page->commit = 0xFF0
- the content in iter->head_page->page
0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0
In rb_iter_head_event, reader will call rb_event_length with argument
(struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
Since the content data start at address 0x71FFFF8111A17FFC are 0xFFFFFFC0.
event->type will be interpret as 0x0, than the reader will try to get the
length by accessing event->array[0], which is an invalid address:
&event->array[0] = 0x71FFFF8111A18000
Cc: stable(a)vger.kernel.org
Signed-off-by: Tze-nan Wu <Tze-nan.Wu(a)mediatek.com>
---
resend again due to forget cc stable(a)vger.kernel.org
Following patch may not become a solution, it merely checks if the address
to be accessed is valid or not within the rb_event_length before access.
And not sure if there is any side-effect it can lead to.
I am curious about what a better solution for this issue would look like.
Should we address the problem from the writer or the reader?
Also I wonder if the cause of the issue is exactly as I suspected.
Any Suggestion will be appreciated.
Test below can reproduce the issue in 2 hours on kernel-6.1.24:
$cd /sys/kernel/tracing/
# make the reader and writer race more through resize the buffer to 8kb
$echo 8 > buffer_size_kn
# enable all events
$echo 1 > event/enable
# enable trace
$echo 1 > tracing_on
# write and run a script that keep reading trace
$./read_trace.sh
``` read_trace.sh
while :
do
cat /sys/kernel/tracing/trace > /dev/null
done
```
---
kernel/trace/ring_buffer.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 78502d4c7214..ed5ddc3a134b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -200,6 +200,8 @@ rb_event_length(struct ring_buffer_event *event)
if (rb_null_event(event))
/* undefined */
return -1;
+ if (((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4)
+ return -1;
return event->array[0] + RB_EVNT_HDR_SIZE;
case RINGBUF_TYPE_TIME_EXTEND:
@@ -209,6 +211,8 @@ rb_event_length(struct ring_buffer_event *event)
return RB_LEN_TIME_STAMP;
case RINGBUF_TYPE_DATA:
+ if ((((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4) && !event->type_len)
+ return -1;
return rb_event_data_length(event);
default:
WARN_ON_ONCE(1);
--
2.18.0
Dear Andrey, dear Nick, dear Greg, dear Sasha,
Compiling the kernel with UBSAN enabled and with gcc-8 and later fails when:
commit 1e1b6d63d634 ("lib/string.c: implement stpcpy") is applied, and
commit bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") is
not applied.
To reproduce, run:
tuxmake -r docker -a arm64 -t gcc-13 -k allnoconfig --kconfig-add
CONFIG_UBSAN=y
It then fails with:
aarch64-linux-gnu-ld: lib/string.o: in function `stpcpy':
string.c:(.text+0x694): undefined reference to
`__ubsan_handle_nonnull_return_v1'
string.c:(.text+0x694): relocation truncated to fit:
R_AARCH64_CALL26 against undefined symbol
`__ubsan_handle_nonnull_return_v1'
Below you find a complete list of architectures, compiler versions and kernel
versions that I have tested with.
As commit bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") is
included in v4.16, and commit 1e1b6d63d634 ("lib/string.c: implement stpcpy") is
included in v5.9, this is not an issue that can happen on any mainline release
or the stable releases v4.19.y and later.
In the v4.14.y branch, however, commit 1e1b6d63d634 ("lib/string.c: implement
stpcpy") was included with v4.14.200 as commit b6d38137c19f and commit
bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") from
mainline was not included yet. Hence, this reported failure with UBSAN can be
observed on v4.14.y with recent gcc versions.
Greg, once checked and confirmed by Andrey or Nick, could you please include
commit bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") into
the linux-4.14.y branch?
The commit applies directly, without any change, on v4.14.200 to v4.14.325.
With that, future versions of v4.14.y will have a working UBSAN with the recent
gcc compiler versions.
Note: For any users, intending to run UBSAN on versions 4.14.200 to v4.14.325,
e.g., for bisecting UBSAN-detected kernel bugs on the linux-4.14.y branch, they
would simply need to apply commit bac7a1fff792 on those release versions.
Appendix of my full testing record:
For arm64 and x86-64 architecture, I tested this whole matrix of combinations of
building v4.14.200, i.e., the first version that failed with the reported build
failure and v4.14.325, i.e., the latest v4.14 release version at the time of
writing.
On v4.14.200 and on v4.14.325:
x86_64:
gcc-7: unsupported configuration (according to tuxmake)
gcc-8: affected and resolved by cherry-picking bac7a1fff792
gcc-9: affected and resolved by cherry-picking bac7a1fff792
gcc-10: affected and resolved by cherry-picking bac7a1fff792
gcc-11:
v4.14.200 fails with an unrelated build error on this compiler and arch
v4.14.325 affected and resolved by cherry-picking bac7a1fff792
gcc-12:
v4.14.200 fails with an unrelated build error on this compiler and arch
v4.14.325 affected and resolved by cherry-picking bac7a1fff792
gcc-13:
v4.14.200 fails with an unrelated build error on this compiler and arch
v4.14.325 affected and resolved by cherry-picking bac7a1fff792
clang-9: unsupported configuration (according to tuxmake)
clang-10: not affected, builds with and without cherry-picking bac7a1fff792
clang-17: not affected, builds with and without cherry-picking bac7a1fff792
arm64:
gcc-7: unsupported configuration (according to tuxmake)
gcc-8: affected and resolved by cherry-picking bac7a1fff792
gcc-9: affected and resolved by cherry-picking bac7a1fff792
gcc-10: affected and resolved by cherry-picking bac7a1fff792
gcc-11: affected and resolved by cherry-picking bac7a1fff792
gcc-12: affected and resolved by cherry-picking bac7a1fff792
gcc-13: affected and resolved by cherry-picking bac7a1fff792
clang-9: unsupported configuration (according to tuxmake)
clang-10: not affected, builds with and without cherry-picking bac7a1fff792
clang-17: not affected, builds with and without cherry-picking bac7a1fff792
Best regards,
Lukas
On 9/4/23 08:21, Denis Arefev wrote:
> The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo)
> is subject to overflow due to a failure to cast operands to a larger
> data type before performing arithmetic.
>
> The maximum result of this subtraction is defined by the RCU_FANOUT
> or other srcu level-spread values assigned by rcu_init_levelspread(),
> which can indeed cause the signed 32-bit integer literal ("1") to overflow
> when shifted by any value greater than 31.
We could expand on this:
The maximum result of this subtraction is defined by the RCU_FANOUT
or other srcu level-spread values assigned by rcu_init_levelspread(),
which can indeed cause the signed 32-bit integer literal ("1") to overflow
when shifted by any value greater than 31 on a 64-bit system.
Moreover, when the subtraction value is 31, the 1 << 31 expression results
in 0xffffffff80000000 when the signed integer is promoted to unsigned long
on 64-bit systems due to type promotion rules, which is certainly not the
intended result.
>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
With the commit message updated with my comment above, please also add:
Fixes: c7e88067c1 ("srcu: Exact tracking of srcu_data structures containing callbacks")
Cc: <stable(a)vger.kernel.org> # v4.11
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Thanks!
Mathieu
>
> Signed-off-by: Denis Arefev <arefev(a)swemel.ru>
> ---
> v3: Changed the name of the patch, as suggested by
> Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
> v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by
> Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
> kernel/rcu/srcutree.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 20d7a238d675..6c18e6005ae1 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
> snp->grplo = cpu;
> snp->grphi = cpu;
> }
> - sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
> + sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo);
> }
> smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
> return true;
> @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp
> int cpu;
>
> for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
> - if (!(mask & (1 << (cpu - snp->grplo))))
> + if (!(mask & (1UL << (cpu - snp->grplo))))
> continue;
> srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay);
> }
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Hi all,
Recently "memfd: improve userspace warnings for missing exec-related
flags" was merged. On my system, this is a regression, not an
improvement, because the entire 256k kernel log buffer (default on x86)
is filled with these warnings and "__do_sys_memfd_create: 122 callbacks
suppressed". I haven't investigated too closely, but the most likely
cause is Wayland libraries.
This is too serious of a consequence for using an old API, especially
considering how recently the flags were added. The vast majority of
software has not had time to add the flags: glibc does not define the
macros until 2.38 which was released less than one month ago, man-pages
does not document the flags, and according to Debian Code Search, only
systemd, stress-ng, and strace actually pass either of these flags.
Furthermore, since old kernels reject unknown flags, it's not just a
matter of defining and passing the flag; every program needs to
add logic to handle EINVAL and try again.
Some other way needs to be found to encourage userspace to add the
flags; otherwise, this message will be patched out because the kernel
log becomes unusable after running unupdated programs, which will still
exist even after upstreams are fixed. In particular, AppImages,
flatpaks, snaps, and similar app bundles contain vendored Wayland
libraries which can be difficult or impossible to update.
Thanks,
Alex.