September 2023 - Linux-stable-mirror

Re: Fwd: Lexar NM790 SSDs are not recognized anymore after 6.1.50 LTS

by Keith Busch

On Tue, Sep 05, 2023 at 05:50:06PM +0200, Cláudio Sampaio wrote:
> Hi Thorsten and Keith,
>
> Thanks for the details. I'm still unsure if responding by email is better
> or adding to the ticket, but here it goes: I have tried for days both with
> complete power off of the machine and cycle-booting all kernels in
> succession and without exception, 6.1.x LTS and the patched 6.5.1 kernel
> always recognize and operate the NVME, whilst the other kernels also fail
> with the same error message. As this is my "production" desktop, though,
> during the week it's more difficult to me to perform tests with it, but I
> will try to do it in a more methodic way and also with 6.5.1 vanilla.
>
> As for the reason the Lexar doesn't catch the quirk default, I can't say I
> catch the complex logic of the driver activation, but I found out how to
> "fix" for my case because there are three other Lexar models in the pci.c
> file: NM610, NM620 and NM760 (this one with an additional quirk marked on
> it on the code, NVME_QUIRK_IGNORE_DEV_SUBNQN) -- so I guess whatever
> justifies the exception for them also justifies for my model, NM790. Might
> even be the case that I would need NVME_QUIRK_IGNORE_DEV_SUBNQN (not sure
> what it does) like in the NM760 case, but it activates correctly without it.

The existing Lexar quirks for the identifier existed before the default
kernel behavior changed with respect to how identifiers are considered.

But the report says the device failed to enumerate with a "device not ready"
error message. That error message happens *before* identifiers are checked,
so the quirk should be a no-op with respect to that error message. And the
driver abandons the device after printing that message, so no further action
should be taken no matter what quirk you've set.

In order for this quirk to have any effect at all, the error you should have
seen should look like a "duplicate IDs" message.

2 years, 3 months


[PATCH] ring-buffer: Do not read at &event->array[0] if it across the page

by Tze-nan Wu

While reading from the tracing/trace, the ftrace reader rarely encounters a
KASAN invalid access issue. It is likely that the writer has disrupted the
ring_buffer that the reader is currently parsing. The KASAN report is as
below:

[name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
[name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
[name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
[name:report&]
CPU: 2 PID: 380 Comm: xxxx
Call trace:
 dump_backtrace+0x168/0x1b0
 show_stack+0x2c/0x3c
 dump_stack_lvl+0xa4/0xd4
 print_report+0x268/0x9b0
 kasan_report+0xdc/0x148
 kasan_tag_mismatch+0x28/0x3c
 __hwasan_tag_mismatch+0x2c/0x58
 rb_event_length() [inline]
 rb_iter_head_event+0x27c/0x3d0
 ring_buffer_iter_peek+0x23c/0x6e0
 __find_next_entry+0x1ac/0x3d8
 s_next+0x1f0/0x310
 seq_read_iter+0x4e8/0x77c
 seq_read+0xf8/0x150
 vfs_read+0x1a8/0x4cc

In some edge cases, the ftrace reader could access an invalid address,
specifically when reading 4 bytes beyond the end of the current page.
When the issue happened, the dump of rb_iter_head_event was as below:

in function rb_iter_head_event:
 - iter->head = 0xFEC
 - iter->next_event = 0xFEC
 - commit = 0xFF0
 - read_stamp = 0x2955AC46DB0
 - page_stamp = 0x2955AC2439A
 - iter->head_page->page = 0x71FFFF8111A17000
 - iter->head_page->time_stamp = 0x2956A142267
 - iter->head_page->page->commit = 0xFF0
 - the content in iter->head_page->page
     0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0

In rb_iter_head_event, the reader will call rb_event_length with argument
(struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
Since the data starting at address 0x71FFFF8111A17FFC is 0xFFFFFFC0,
event->type will be interpreted as 0x0, then the reader will try to get the
length by accessing event->array[0], which is an invalid address:
    &event->array[0] = 0x71FFFF8111A18000

Cc: stable(a)vger.kernel.org
Signed-off-by: Tze-nan Wu <Tze-nan.Wu(a)mediatek.com>
---
Resent because I forgot to cc stable(a)vger.kernel.org.

The following patch may not become a solution; it merely checks whether the
address to be accessed is valid within rb_event_length before the access.
I am also not sure whether it can lead to any side effects.

I am curious about what a better solution for this issue would look like.
Should we address the problem from the writer or the reader?

Also I wonder if the cause of the issue is exactly as I suspected.
Any suggestion will be appreciated.

The test below can reproduce the issue in 2 hours on kernel-6.1.24:
    $cd /sys/kernel/tracing/
    # make the reader and writer race more by resizing the buffer to 8kb
    $echo 8 > buffer_size_kb
    # enable all events
    $echo 1 > events/enable
    # enable trace
    $echo 1 > tracing_on

    # write and run a script that keeps reading trace
    $./read_trace.sh

    ``` read_trace.sh
       while :
       do
           cat /sys/kernel/tracing/trace > /dev/null
       done
    ```
---
 kernel/trace/ring_buffer.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 78502d4c7214..ed5ddc3a134b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -200,6 +200,8 @@ rb_event_length(struct ring_buffer_event *event)
 		if (rb_null_event(event))
 			/* undefined */
 			return -1;
+		if (((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4)
+			return -1;
 		return event->array[0] + RB_EVNT_HDR_SIZE;
 
 	case RINGBUF_TYPE_TIME_EXTEND:
@@ -209,6 +211,8 @@ rb_event_length(struct ring_buffer_event *event)
 		return RB_LEN_TIME_STAMP;
 
 	case RINGBUF_TYPE_DATA:
+		if ((((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4) && !event->type_len)
+			return -1;
 		return rb_event_data_length(event);
 	default:
 		WARN_ON_ONCE(1);
-- 
2.18.0
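The address arithmetic in this report can be checked with a small userspace sketch. It assumes 4 KiB pages and a 4-byte event header in front of array[0] (both implied by the patch's `PAGE_SIZE - 4` and `0xfffUL` test); the helper names and `SKETCH_*` macros are made up for illustration and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define SKETCH_EVENT_HDR 4u	/* assumption: 4-byte header before array[0] */

/*
 * Hypothetical helper: true when &event->array[0] would land on the next
 * page, i.e. when the event header occupies the last 4 bytes of its page.
 * Equivalent to the patch's
 * ((unsigned long)event & 0xfffUL) >= PAGE_SIZE - 4 test.
 */
static bool array0_crosses_page(uint64_t event_addr)
{
	uint64_t offset_in_page = event_addr & (SKETCH_PAGE_SIZE - 1);

	return offset_in_page + SKETCH_EVENT_HDR >= SKETCH_PAGE_SIZE;
}

/* Address of array[0] for an event located at event_addr. */
static uint64_t array0_addr(uint64_t event_addr)
{
	return event_addr + SKETCH_EVENT_HDR;
}
```

For the event in the report, `array0_crosses_page(0x71FFFF8111A17FFC)` is true and `array0_addr()` gives 0x71FFFF8111A18000, the address KASAN flagged.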

2 years, 3 months


[PATCH v2 2/2] hwspinlock: qcom: Remove IPQ6018 SOC specific compatible

by Vignesh Viswanathan

IPQ6018 has 32 tcsr_mutex hwlock registers with stride 0x1000.
The compatible string qcom,ipq6018-tcsr-mutex is mapped to
of_msm8226_tcsr_mutex which has 32 locks configured with stride of 0x80
and doesn't match the HW present in IPQ6018.

Remove IPQ6018 specific compatible string so that it falls back to
of_tcsr_mutex data which maps to the correct configuration for IPQ6018.

Changes in v2:
 - Updated commit message
 - Added Fixes and stable tags

Cc: stable(a)vger.kernel.org
Fixes: 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older SoCs")
Signed-off-by: Vignesh Viswanathan <quic_viswanat(a)quicinc.com>
---
 drivers/hwspinlock/qcom_hwspinlock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/hwspinlock/qcom_hwspinlock.c b/drivers/hwspinlock/qcom_hwspinlock.c
index a0fd67fd2934..814dfe8697bf 100644
--- a/drivers/hwspinlock/qcom_hwspinlock.c
+++ b/drivers/hwspinlock/qcom_hwspinlock.c
@@ -115,7 +115,6 @@ static const struct of_device_id qcom_hwspinlock_of_match[] = {
 	{ .compatible = "qcom,sfpb-mutex", .data = &of_sfpb_mutex },
 	{ .compatible = "qcom,tcsr-mutex", .data = &of_tcsr_mutex },
 	{ .compatible = "qcom,apq8084-tcsr-mutex", .data = &of_msm8226_tcsr_mutex },
-	{ .compatible = "qcom,ipq6018-tcsr-mutex", .data = &of_msm8226_tcsr_mutex },
 	{ .compatible = "qcom,msm8226-tcsr-mutex", .data = &of_msm8226_tcsr_mutex },
 	{ .compatible = "qcom,msm8974-tcsr-mutex", .data = &of_msm8226_tcsr_mutex },
 	{ .compatible = "qcom,msm8994-tcsr-mutex", .data = &of_msm8226_tcsr_mutex },
-- 
2.41.0
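To see why the stride matters, here is a minimal sketch of how a lock index could map to a register offset under the two strides named in the commit message. The helper `hwlock_offset` is hypothetical, and it assumes lock 0 sits at offset 0 from the mutex base, which the patch does not state.

```c
#include <stdint.h>

/*
 * Hypothetical helper: byte offset of lock `id` from the mutex base.
 * Assumption: lock 0 is at offset 0 and locks are evenly spaced.
 */
static uint32_t hwlock_offset(uint32_t id, uint32_t stride)
{
	return id * stride;
}
```

With the 0x80 stride, lock 1 would be addressed at offset 0x80, whereas hardware with a 0x1000 stride has it at 0x1000, so every lock except lock 0 would be looked up at the wrong register.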

2 years, 3 months


[PATCH v2 1/2] arm64: dts: qcom: ipq6018: Fix tcsr_mutex register size

by Vignesh Viswanathan

IPQ6018's TCSR Mutex HW lock register has 32 locks of size 4KB each.
Total size of the TCSR Mutex registers is 128KB.

Fix size of the tcsr_mutex hwlock register to 0x20000.

Changes in v2:
 - Drop change to remove qcom,ipq6018-tcsr-mutex compatible string
 - Added Fixes and stable tags

Cc: stable(a)vger.kernel.org
Fixes: 5bf635621245 ("arm64: dts: ipq6018: Add a few device nodes")
Signed-off-by: Vignesh Viswanathan <quic_viswanat(a)quicinc.com>
---
 arch/arm64/boot/dts/qcom/ipq6018.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
index 47b8b1d6730a..9793279e2ced 100644
--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi
+++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi
@@ -393,7 +393,7 @@ gcc: gcc@1800000 {
 
 		tcsr_mutex: hwlock@1905000 {
 			compatible = "qcom,ipq6018-tcsr-mutex", "qcom,tcsr-mutex";
-			reg = <0x0 0x01905000 0x0 0x1000>;
+			reg = <0x0 0x01905000 0x0 0x20000>;
 			#hwlock-cells = <1>;
 		};
 
-- 
2.41.0
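The size in the new `reg` property follows from the numbers in the commit message. As a minimal sketch (the helper `hwlock_region_size` is hypothetical, and it assumes the locks are laid out back to back with no gaps):

```c
#include <stdint.h>

/*
 * Hypothetical helper: total register span needed for `num_locks` locks
 * spaced `stride` bytes apart. Assumption: contiguous layout, no padding.
 */
static uint32_t hwlock_region_size(uint32_t num_locks, uint32_t stride)
{
	return num_locks * stride;
}
```

`hwlock_region_size(32, 0x1000)` is 0x20000, i.e. 128KB, matching the fixed value. The old size of 0x1000 covers only a single 4KB lock.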

2 years, 3 months


Include bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") into linux-4.14.y

by Lukas Bulwahn

Dear Andrey, dear Nick, dear Greg, dear Sasha,

Compiling the kernel with UBSAN enabled and with gcc-8 and later fails when:

  commit 1e1b6d63d634 ("lib/string.c: implement stpcpy") is applied, and
  commit bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") is
  not applied.

To reproduce, run:

  tuxmake -r docker -a arm64 -t gcc-13 -k allnoconfig --kconfig-add CONFIG_UBSAN=y

It then fails with:

  aarch64-linux-gnu-ld: lib/string.o: in function `stpcpy':
  string.c:(.text+0x694): undefined reference to `__ubsan_handle_nonnull_return_v1'
  string.c:(.text+0x694): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `__ubsan_handle_nonnull_return_v1'

Below you find a complete list of architectures, compiler versions and kernel
versions that I have tested with.

As commit bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") is
included in v4.16, and commit 1e1b6d63d634 ("lib/string.c: implement stpcpy") is
included in v5.9, this is not an issue that can happen on any mainline release
or the stable releases v4.19.y and later.

In the v4.14.y branch, however, commit 1e1b6d63d634 ("lib/string.c: implement
stpcpy") was included with v4.14.200 as commit b6d38137c19f and commit
bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") from
mainline was not included yet. Hence, this reported failure with UBSAN can be
observed on v4.14.y with recent gcc versions.

Greg, once checked and confirmed by Andrey or Nick, could you please include
commit bac7a1fff792 ("lib/ubsan: remove returns-nonnull-attribute checks") into
the linux-4.14.y branch?

The commit applies directly, without any change, on v4.14.200 to v4.14.325.

With that, future versions of v4.14.y will have a working UBSAN with the recent
gcc compiler versions.

Note: For any users, intending to run UBSAN on versions 4.14.200 to v4.14.325,
e.g., for bisecting UBSAN-detected kernel bugs on the linux-4.14.y branch, they
would simply need to apply commit bac7a1fff792 on those release versions.

Appendix of my full testing record:

For arm64 and x86-64 architecture, I tested this whole matrix of combinations of
building v4.14.200, i.e., the first version that failed with the reported build
failure and v4.14.325, i.e., the latest v4.14 release version at the time of
writing.

On v4.14.200 and on v4.14.325:

  x86_64:
    gcc-7: unsupported configuration (according to tuxmake)
    gcc-8: affected and resolved by cherry-picking bac7a1fff792
    gcc-9: affected and resolved by cherry-picking bac7a1fff792
    gcc-10: affected and resolved by cherry-picking bac7a1fff792
    gcc-11: v4.14.200 fails with an unrelated build error on this compiler and arch
            v4.14.325 affected and resolved by cherry-picking bac7a1fff792
    gcc-12: v4.14.200 fails with an unrelated build error on this compiler and arch
            v4.14.325 affected and resolved by cherry-picking bac7a1fff792
    gcc-13: v4.14.200 fails with an unrelated build error on this compiler and arch
            v4.14.325 affected and resolved by cherry-picking bac7a1fff792
    clang-9: unsupported configuration (according to tuxmake)
    clang-10: not affected, builds with and without cherry-picking bac7a1fff792
    clang-17: not affected, builds with and without cherry-picking bac7a1fff792

  arm64:
    gcc-7: unsupported configuration (according to tuxmake)
    gcc-8: affected and resolved by cherry-picking bac7a1fff792
    gcc-9: affected and resolved by cherry-picking bac7a1fff792
    gcc-10: affected and resolved by cherry-picking bac7a1fff792
    gcc-11: affected and resolved by cherry-picking bac7a1fff792
    gcc-12: affected and resolved by cherry-picking bac7a1fff792
    gcc-13: affected and resolved by cherry-picking bac7a1fff792
    clang-9: unsupported configuration (according to tuxmake)
    clang-10: not affected, builds with and without cherry-picking bac7a1fff792
    clang-17: not affected, builds with and without cherry-picking bac7a1fff792

Best regards,

Lukas

2 years, 3 months


Re: DRBD broken in kernel 6.5 and 6.5.1

by Jakub Kicinski

CC: David

On Sat, 2 Sep 2023 22:31:06 +0200 Serguei Ivantsov wrote:
> Hello,
>
> After upgrading the kernel to 6.5 the system can't connect to the peer
> (6.4.11) anymore.
> I checked 6.5.1 - same issue.
> All previous kernels including 6.4.14 are working just fine.
> Checking the 6.5 changelog, I found commit
> 9ae440b8fdd6772b6c007fa3d3766530a09c9045 which mentioned some changes to
> DRBD.

I don't see anything obviously wrong here, maybe David has better ideas.

Can you try this?

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 79ab532aabaf..c607d4304608 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1539,8 +1539,6 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 			   int offset, size_t size, unsigned msg_flags)
 {
 	struct socket *socket = peer_device->connection->data.socket;
-	struct msghdr msg = { .msg_flags = msg_flags, };
-	struct bio_vec bvec;
 	int len = size;
 	int err = -EIO;
 
@@ -1551,10 +1549,13 @@ static int _drbd_send_page(struct drbd_peer_device *peer_device, struct page *pa
 	 * __page_cache_release a page that would actually still be referenced
 	 * by someone, leading to some obscure delayed Oops somewhere else. */
 	if (!drbd_disable_sendpage && sendpage_ok(page))
-		msg.msg_flags |= MSG_NOSIGNAL | MSG_SPLICE_PAGES;
+		msg_flags |= MSG_SPLICE_PAGES;
 
+	msg_flags |= MSG_NOSIGNAL;
 	drbd_update_congested(peer_device->connection);
 	do {
+		struct msghdr msg = { .msg_flags = msg_flags, };
+		struct bio_vec bvec;
 		int sent;
 
 		bvec_set_page(&bvec, page, offset, len);

> On the 6.5.X system I have the following in the kernel log (drbd_send_block()
> failed):
>
> [    2.473497] drbd: initialized.
Version: 8.4.11 (api:1/proto:86-101) > > [ 2.475394] drbd: built-in > > [ 2.477254] drbd: registered as block device major 147 > > [ 7.421400] drbd drbd0: Starting worker thread (from drbdsetup-84 [3844]) > > [ 7.421509] drbd drbd0/0 drbd0: disk( Diskless -> Attaching ) > > [ 7.421552] drbd drbd0: Method to ensure write ordering: flush > > [ 7.421554] drbd drbd0/0 drbd0: max BIO size = 131072 > > [ 7.421557] drbd drbd0/0 drbd0: drbd_bm_resize called with capacity == > 1845173184 > > [ 7.428017] drbd drbd0/0 drbd0: resync bitmap: bits=230646648 > words=3603854 pages=7039 > > [ 7.467370] drbd0: detected capacity change from 0 to 1845173184 > > [ 7.467372] drbd drbd0/0 drbd0: size = 880 GB (922586592 KB) > > [ 7.486005] drbd drbd0/0 drbd0: recounting of set bits took additional 0 > jiffies > > [ 7.486010] drbd drbd0/0 drbd0: 0 KB (0 bits) marked out-of-sync by on > disk bit-map. > > [ 7.486017] drbd drbd0/0 drbd0: disk( Attaching -> UpToDate ) > > [ 7.486021] drbd drbd0/0 drbd0: attached to UUIDs > 32DDB2019708F68A:0000000000000000:7D97648599B446DD:7D96648599B446DD > > [ 7.486863] drbd drbd0: conn( StandAlone -> Unconnected ) > > [ 7.486871] drbd drbd0: Starting receiver thread (from drbd_w_drbd0 > [3847]) > > [ 7.486918] drbd drbd0: receiver (re)started > > [ 7.486929] drbd drbd0: conn( Unconnected -> WFConnection ) > > [ 12.340212] drbd drbd0: initial packet S crossed > > [ 22.310856] drbd drbd0: Handshake successful: Agreed network protocol > version 101 > > [ 22.311087] drbd drbd0: Feature flags enabled on protocol level: 0xf > TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES. 
> > [ 22.311425] drbd drbd0: conn( WFConnection -> WFReportParams ) > > [ 22.311621] drbd drbd0: Starting ack_recv thread (from drbd_r_drbd0 > [4071]) > > [ 22.400702] drbd drbd0/0 drbd0: drbd_sync_handshake: > > [ 22.400869] drbd drbd0/0 drbd0: self > 32DDB2019708F68A:0000000000000000:7D97648599B446DD:7D96648599B446DD bits:0 > flags:0 > > [ 22.401205] drbd drbd0/0 drbd0: peer > 32DDB2019708F68A:0000000000000000:7D97648599B446DC:7D96648599B446DD bits:0 > flags:0 > > [ 22.401538] drbd drbd0/0 drbd0: uuid_compare()=0 by rule 40 > > [ 22.401709] drbd drbd0/0 drbd0: peer( Unknown -> Secondary ) conn( > WFReportParams -> Connected ) pdsk( DUnknown -> UpToDate ) > > [ 22.415394] drbd drbd0/0 drbd0: role( Secondary -> Primary ) > > [ 22.506540] drbd drbd0/0 drbd0: _drbd_send_page: size=4096 len=4096 > sent=-5 > > [ 22.506773] drbd drbd0: peer( Secondary -> Unknown ) conn( Connected -> > NetworkFailure ) pdsk( UpToDate -> DUnknown ) > > [ 22.507109] drbd drbd0/0 drbd0: new current UUID > 7F8B15C04AF49C4D:32DDB2019708F68B:7D97648599B446DD:7D96648599B446DD > > [ 22.507451] drbd drbd0: ack_receiver terminated > > [ 22.507588] drbd drbd0: Terminating drbd_a_drbd0 > > [ 22.600693] drbd drbd0: Connection closed > > [ 22.600937] drbd drbd0: conn( NetworkFailure -> Unconnected ) > > [ 22.601115] drbd drbd0: receiver terminated > > [ 22.601238] drbd drbd0: Restarting receiver thread > > [ 22.601378] drbd drbd0: receiver (re)started > > [ 22.601508] drbd drbd0: conn( Unconnected -> WFConnection ) > > [ 23.260624] drbd drbd0: Handshake successful: Agreed network protocol > version 101 > > [ 23.260859] drbd drbd0: Feature flags enabled on protocol level: 0xf > TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES. 
> > [ 23.261187] drbd drbd0: conn( WFConnection -> WFReportParams ) > > [ 23.261367] drbd drbd0: Starting ack_recv thread (from drbd_r_drbd0 > [4071]) > > [ 23.340593] drbd drbd0/0 drbd0: drbd_sync_handshake: > > [ 23.340771] drbd drbd0/0 drbd0: self > 7F8B15C04AF49C4D:32DDB2019708F68B:7D97648599B446DD:7D96648599B446DD bits:1 > flags:0 > > [ 23.341192] drbd drbd0/0 drbd0: peer > 32DDB2019708F68A:0000000000000000:7D97648599B446DC:7D96648599B446DD bits:0 > flags:0 > > [ 23.341649] drbd drbd0/0 drbd0: uuid_compare()=1 by rule 70 > > [ 23.341824] drbd drbd0/0 drbd0: peer( Unknown -> Secondary ) conn( > WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent ) > > [ 23.344911] drbd drbd0/0 drbd0: send bitmap stats [Bytes(packets)]: > plain 0(0), RLE 23(1), total 23; compression: 100.0% > > [ 23.396792] drbd drbd0/0 drbd0: receive bitmap stats [Bytes(packets)]: > plain 0(0), RLE 23(1), total 23; compression: 100.0% > > [ 23.397210] drbd drbd0/0 drbd0: helper command: /sbin/drbdadm > before-resync-source minor-0 > > [ 23.407965] drbd drbd0/0 drbd0: helper command: /sbin/drbdadm > before-resync-source minor-0 exit code 0 (0x0) > > [ 23.417547] drbd drbd0/0 drbd0: conn( WFBitMapS -> SyncSource ) pdsk( > Consistent -> Inconsistent ) > > [ 23.426697] drbd drbd0/0 drbd0: Began resync as SyncSource (will sync 4 > KB [1 bits set]). 
> > [ 23.435638] drbd drbd0/0 drbd0: updated sync UUID > 7F8B15C04AF49C4D:32DEB2019708F68B:32DDB2019708F68B:7D97648599B446DD > > [ 23.488608] drbd drbd0/0 drbd0: _drbd_send_page: size=4096 len=4096 > sent=-5 > > [ 23.498182] drbd drbd0/0 drbd0: drbd_send_block() failed > > [ 23.508498] drbd drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> > NetworkFailure ) > > [ 23.517597] drbd drbd0: ack_receiver terminated > > [ 23.527513] drbd drbd0: Terminating drbd_a_drbd0 > > [ 23.690598] drbd drbd0: Connection closed > > [ 23.701857] drbd drbd0: conn( NetworkFailure -> Unconnected ) > > [ 23.712017] drbd drbd0: receiver terminated > > [ 23.721597] drbd drbd0: Restarting receiver thread > > > > On the peer: > > > [349071.038278] drbd drbd0: conn( Unconnected -> WFConnection ) > > [349071.558245] drbd drbd0: Handshake successful: Agreed network protocol > version 101 > > [349071.562105] drbd drbd0: Feature flags enabled on protocol level: 0xf > TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES. > > [349071.569889] drbd drbd0: conn( WFConnection -> WFReportParams ) > > [349071.573802] drbd drbd0: Starting ack_recv thread (from drbd_r_drbd0 > [2660]) > > [349071.688547] drbd drbd0/0 drbd0: drbd_sync_handshake: > > [349071.692323] drbd drbd0/0 drbd0: self > 3375B2019708F68A:0000000000000000:7D97648599B446DC:7D96648599B446DD bits:1 > flags:0 > > [349071.699871] drbd drbd0/0 drbd0: peer > 7F8B15C04AF49C4D:3375B2019708F68B:3374B2019708F68B:3373B2019708F68B bits:1 > flags:0 > > [349071.707687] drbd drbd0/0 drbd0: uuid_compare()=-1 by rule 50 > > [349071.711563] drbd drbd0/0 drbd0: Becoming sync target due to disk states. 
> > [349071.715381] drbd drbd0/0 drbd0: peer( Unknown -> Primary ) conn( > WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) > > [349071.723039] drbd drbd0/0 drbd0: receive bitmap stats [Bytes(packets)]: > plain 0(0), RLE 23(1), total 23; compression: 100.0% > > [349071.732489] drbd drbd0/0 drbd0: send bitmap stats [Bytes(packets)]: > plain 0(0), RLE 23(1), total 23; compression: 100.0% > > [349071.740178] drbd drbd0/0 drbd0: conn( WFBitMapT -> WFSyncUUID ) > > [349071.787113] drbd drbd0/0 drbd0: updated sync uuid > 3376B2019708F68A:0000000000000000:7D97648599B446DC:7D96648599B446DD > > [349071.794907] drbd drbd0/0 drbd0: helper command: /sbin/drbdadm > before-resync-target minor-0 > > [349071.800006] drbd drbd0/0 drbd0: helper command: /sbin/drbdadm > before-resync-target minor-0 exit code 0 (0x0) > > [349071.807737] drbd drbd0/0 drbd0: conn( WFSyncUUID -> SyncTarget ) > > [349071.811639] drbd drbd0/0 drbd0: Began resync as SyncTarget (will sync 4 > KB [1 bits set]). > > [349071.916117] drbd drbd0: sock was shut down by peer > > [349071.919955] drbd drbd0: peer( Primary -> Unknown ) conn( SyncTarget -> > BrokenPipe ) pdsk( UpToDate -> DUnknown ) > > [349071.927796] drbd drbd0: short read (expected size 4096) > > [349071.931812] drbd drbd0: error receiving RSDataReply, e: -5 l: 4096! > > [349071.935864] drbd drbd0: ack_receiver terminated > > [349071.939906] drbd drbd0: Terminating drbd_a_drbd0 > > [349072.088385] drbd drbd0: Connection closed > > [349072.092398] drbd drbd0: conn( BrokenPipe -> Unconnected ) > > [349072.096436] drbd drbd0: receiver terminated > > [349072.100469] drbd drbd0: Restarting receiver thread > > [349072.104454] drbd drbd0: receiver (re)started > > [349072.108373] drbd drbd0: conn( Unconnected -> WFConnection ) > > > -- > > Best Regards, > > Serguei

2 years, 3 months


Re: [PATCH v3] Fix srcu_struct node grpmask overflow on 64-bit systems

by Mathieu Desnoyers

On 9/4/23 08:21, Denis Arefev wrote:
> The value of an arithmetic expression 1 << (cpu - sdp->mynode->grplo)
> is subject to overflow due to a failure to cast operands to a larger
> data type before performing arithmetic.
>
> The maximum result of this subtraction is defined by the RCU_FANOUT
> or other srcu level-spread values assigned by rcu_init_levelspread(),
> which can indeed cause the signed 32-bit integer literal ("1") to overflow
> when shifted by any value greater than 31.

We could expand on this:

The maximum result of this subtraction is defined by the RCU_FANOUT
or other srcu level-spread values assigned by rcu_init_levelspread(),
which can indeed cause the signed 32-bit integer literal ("1") to
overflow when shifted by any value greater than 31 on a 64-bit system.

Moreover, when the subtraction value is 31, the 1 << 31 expression results
in 0xffffffff80000000 when the signed integer is promoted to unsigned long
on 64-bit systems due to type promotion rules, which is certainly not the
intended result.

>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.

With the commit message updated with my comment above, please also add:

Fixes: c7e88067c1 ("srcu: Exact tracking of srcu_data structures containing callbacks")
Cc: <stable(a)vger.kernel.org> # v4.11
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>

Thanks!

Mathieu

>
> Signed-off-by: Denis Arefev <arefev(a)swemel.ru>
> ---
> v3: Changed the name of the patch, as suggested by
> Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
> v2: Added fixes to the srcu_schedule_cbs_snp function as suggested by
> Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
>  kernel/rcu/srcutree.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 20d7a238d675..6c18e6005ae1 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -223,7 +223,7 @@ static bool init_srcu_struct_nodes(struct srcu_struct *ssp, gfp_t gfp_flags)
>   				snp->grplo = cpu;
>   			snp->grphi = cpu;
>   		}
> -		sdp->grpmask = 1 << (cpu - sdp->mynode->grplo);
> +		sdp->grpmask = 1UL << (cpu - sdp->mynode->grplo);
>   	}
>   	smp_store_release(&ssp->srcu_sup->srcu_size_state, SRCU_SIZE_WAIT_BARRIER);
>   	return true;
> @@ -833,7 +833,7 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp
>   	int cpu;
>
>   	for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
> -		if (!(mask & (1 << (cpu - snp->grplo))))
> +		if (!(mask & (1UL << (cpu - snp->grplo))))
>   			continue;
>   		srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay);
>   	}

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
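The promotion effect described in the review can be reproduced in userspace without relying on a signed shift. The sketch below is illustrative only: the function names are made up, `uint64_t` stands in for the kernel's `unsigned long` on a 64-bit system, and `INT32_MIN` is used as the bit pattern 0x80000000 that the email gives for `1 << 31`.

```c
#include <stdint.h>

/*
 * Widening a negative 32-bit int to an unsigned 64-bit type sign-extends
 * it, which is how a mask with only bit 31 set turns into
 * 0xffffffff80000000.
 */
static uint64_t widen_int_mask(int32_t mask)
{
	return (uint64_t)mask;
}

/* Patched form: shift a 64-bit one, so no sign bit is involved. */
static uint64_t grpmask_64(unsigned int shift)
{
	return UINT64_C(1) << shift;
}
```

`widen_int_mask(INT32_MIN)` yields 0xffffffff80000000, while `grpmask_64(31)` yields 0x80000000 and shifts of 32 or more still produce a single set bit.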

2 years, 3 months


Don't fill the kernel log with memfd_create messages

by Alex Xu (Hello71)

Hi all,

Recently "memfd: improve userspace warnings for missing exec-related
flags" was merged. On my system, this is a regression, not an
improvement, because the entire 256k kernel log buffer (default on x86)
is filled with these warnings and "__do_sys_memfd_create: 122 callbacks
suppressed". I haven't investigated too closely, but the most likely
cause is Wayland libraries.

This is too serious of a consequence for using an old API, especially
considering how recently the flags were added. The vast majority of
software has not had time to add the flags: glibc does not define the
macros until 2.38 which was released less than one month ago, man-pages
does not document the flags, and according to Debian Code Search, only
systemd, stress-ng, and strace actually pass either of these flags.

Furthermore, since old kernels reject unknown flags, it's not just a
matter of defining and passing the flag; every program needs to add
logic to handle EINVAL and try again.

Some other way needs to be found to encourage userspace to add the
flags; otherwise, this message will be patched out because the kernel
log becomes unusable after running unupdated programs, which will still
exist even after upstreams are fixed. In particular, AppImages,
flatpaks, snaps, and similar app bundles contain vendored Wayland
libraries which can be difficult or impossible to update.

Thanks,
Alex.
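The "handle EINVAL and try again" logic mentioned above could look roughly like the sketch below. It is an illustration under stated assumptions, not code from any of the projects named: `memfd_create_compat`, `memfd_fn` and `fake_old_kernel_memfd` are hypothetical names, the flag values are the ones from the Linux UAPI header `<linux/memfd.h>`, and the choice of `MFD_NOEXEC_SEAL` over `MFD_EXEC` is arbitrary. The creation call is passed in as a function pointer so the retry path can be exercised against a stand-in for an older kernel.

```c
#include <errno.h>

/* Flag values from <linux/memfd.h>; defined here only so the sketch
 * builds without that header or against an older glibc. */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U
#endif
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif

/* Same shape as memfd_create(2): returns an fd, or -1 with errno set. */
typedef int (*memfd_fn)(const char *name, unsigned int flags);

/*
 * Hypothetical wrapper: try with the exec-related flag first, and on
 * EINVAL (an older kernel rejecting the unknown flag) retry without it.
 */
static int memfd_create_compat(memfd_fn create, const char *name,
			       unsigned int flags)
{
	int fd = create(name, flags | MFD_NOEXEC_SEAL);

	if (fd < 0 && errno == EINVAL)
		fd = create(name, flags);
	return fd;
}

/*
 * Stand-in for an older kernel, used only to exercise the retry path:
 * it accepts MFD_CLOEXEC and rejects every other flag with EINVAL.
 */
static int fake_old_kernel_memfd(const char *name, unsigned int flags)
{
	(void)name;
	if (flags & ~MFD_CLOEXEC) {
		errno = EINVAL;
		return -1;
	}
	return 3; /* pretend file descriptor */
}
```

In a real program the first argument would be `memfd_create` from `<sys/mman.h>` (with `_GNU_SOURCE` defined), and a program that needs an executable memfd would pass `MFD_EXEC` instead.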

2 years, 3 months


[PATCH 5.10/5.15/6.1 1/1] udf: Handle error when adding extent to a file

by Vladislav Efanov

From: Jan Kara <jack(a)suse.cz>

commit 19fd80de0a8b5170ef34704c8984cca920dffa59 upstream

When adding extent to a file fails, so far we've silently squelched the
error. Make sure to propagate it up properly.

Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Vladislav Efanov <VEfanov(a)ispras.ru>
---
Syzkaller reports this problem in 5.10 stable release. The problem has
been fixed by the following patch which can be cleanly applied to the
5.10/5.15/6.1 branches.

 fs/udf/inode.c | 41 +++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/fs/udf/inode.c b/fs/udf/inode.c
index d114774ecdea..3e11190b7118 100644
--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -57,15 +57,15 @@ static int udf_update_inode(struct inode *, int);
 static int udf_sync_inode(struct inode *inode);
 static int udf_alloc_i_data(struct inode *inode, size_t size);
 static sector_t inode_getblk(struct inode *, sector_t, int *, int *);
-static int8_t udf_insert_aext(struct inode *, struct extent_position,
-			      struct kernel_lb_addr, uint32_t);
+static int udf_insert_aext(struct inode *, struct extent_position,
+			   struct kernel_lb_addr, uint32_t);
 static void udf_split_extents(struct inode *, int *, int, udf_pblk_t,
 			      struct kernel_long_ad *, int *);
 static void udf_prealloc_extents(struct inode *, int, int,
 				 struct kernel_long_ad *, int *);
 static void udf_merge_extents(struct inode *, struct kernel_long_ad *, int *);
-static void udf_update_extents(struct inode *, struct kernel_long_ad *, int,
-			       int, struct extent_position *);
+static int udf_update_extents(struct inode *, struct kernel_long_ad *, int,
+			      int, struct extent_position *);
 static int udf_get_block(struct inode *, sector_t, struct buffer_head *, int);
 
 static void __udf_clear_extent_cache(struct inode *inode)
@@ -887,7 +887,9 @@ static sector_t inode_getblk(struct inode *inode, sector_t block,
 	/* write back the new extents, inserting new extents if the new number
 	 * of extents is greater than the old number, and deleting extents if
 	 * the new number of extents is less than the old number */
-	udf_update_extents(inode, laarr, startnum, endnum, &prev_epos);
+	*err = udf_update_extents(inode, laarr, startnum, endnum, &prev_epos);
+	if (*err < 0)
+		goto out_free;
 
 	newblock = udf_get_pblock(inode->i_sb, newblocknum,
 				iinfo->i_location.partitionReferenceNum, 0);
@@ -1155,21 +1157,30 @@ static void udf_merge_extents(struct inode *inode, struct kernel_long_ad *laarr,
 	}
 }
 
-static void udf_update_extents(struct inode *inode, struct kernel_long_ad *laarr,
-			       int startnum, int endnum,
-			       struct extent_position *epos)
+static int udf_update_extents(struct inode *inode, struct kernel_long_ad *laarr,
+			      int startnum, int endnum,
+			      struct extent_position *epos)
 {
 	int start = 0, i;
 	struct kernel_lb_addr tmploc;
 	uint32_t tmplen;
+	int err;
 
 	if (startnum > endnum) {
 		for (i = 0; i < (startnum - endnum); i++)
 			udf_delete_aext(inode, *epos);
 	} else if (startnum < endnum) {
 		for (i = 0; i < (endnum - startnum); i++) {
-			udf_insert_aext(inode, *epos, laarr[i].extLocation,
-					laarr[i].extLength);
+			err = udf_insert_aext(inode, *epos,
+					      laarr[i].extLocation,
+					      laarr[i].extLength);
+			/*
+			 * If we fail here, we are likely corrupting the extent
+			 * list and leaking blocks. At least stop early to
+			 * limit the damage.
+			 */
+			if (err < 0)
+				return err;
 			udf_next_aext(inode, epos, &laarr[i].extLocation,
 				      &laarr[i].extLength, 1);
 			start++;
@@ -1181,6 +1192,7 @@ static void udf_update_extents(struct inode *inode, struct kernel_long_ad *laarr
 		udf_write_aext(inode, epos, &laarr[i].extLocation,
 			       laarr[i].extLength, 1);
 	}
+	return 0;
 }
 
 struct buffer_head *udf_bread(struct inode *inode, udf_pblk_t block,
@@ -2215,12 +2227,13 @@ int8_t udf_current_aext(struct inode *inode, struct extent_position *epos,
 	return etype;
 }
 
-static int8_t udf_insert_aext(struct inode *inode, struct extent_position epos,
-			      struct kernel_lb_addr neloc, uint32_t nelen)
+static int udf_insert_aext(struct inode *inode, struct extent_position epos,
+			   struct kernel_lb_addr neloc, uint32_t nelen)
 {
 	struct kernel_lb_addr oeloc;
 	uint32_t oelen;
 	int8_t etype;
+	int err;
 
 	if (epos.bh)
 		get_bh(epos.bh);
@@ -2230,10 +2243,10 @@ static int8_t udf_insert_aext(struct inode *inode, struct extent_position epos,
 		neloc = oeloc;
 		nelen = (etype << 30) | oelen;
 	}
-	udf_add_aext(inode, &epos, &neloc, nelen, 1);
+	err = udf_add_aext(inode, &epos, &neloc, nelen, 1);
 	brelse(epos.bh);
 
-	return (nelen >> 30);
+	return err;
 }
 
 int8_t udf_delete_aext(struct inode *inode, struct extent_position epos)
-- 
2.34.1
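One detail worth spelling out is what the old `return (nelen >> 30);` in udf_insert_aext() actually returned. Going by the `nelen = (etype << 30) | oelen;` line in the diff, the top two bits of the 32-bit length word hold the extent type, so the old return value was an extent type rather than an error code. A minimal sketch of that packing, with hypothetical helper names and assuming the remaining 30 bits hold the length:

```c
#include <stdint.h>

#define SKETCH_LEN_MASK 0x3FFFFFFFu	/* assumption: low 30 bits are the length */

/* Pack a 2-bit extent type and a 30-bit length into one word, following
 * nelen = (etype << 30) | oelen from the diff. */
static uint32_t pack_extent_len(uint32_t etype, uint32_t len)
{
	return (etype << 30) | (len & SKETCH_LEN_MASK);
}

/* What the old return value computed: the extent type. */
static uint32_t extent_type(uint32_t nelen)
{
	return nelen >> 30;
}

/* The length part of the packed word. */
static uint32_t extent_len(uint32_t nelen)
{
	return nelen & SKETCH_LEN_MASK;
}
```

Because `extent_type()` can never be negative, a caller had no way to learn from the old return value that udf_add_aext() failed, which is the gap the patch closes by returning `err` instead.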

2 years, 3 months


Re: [PATCH 6.5 00/34] 6.5.2-rc1 review

by Ronald Warsow

Hi Greg

6.5.2-rc1

compiles, boots and runs here on x86_64
(Intel Rocket Lake, i5-11400)

Thanks

Tested-by: Ronald Warsow <rwarsow(a)gmx.de>

2 years, 3 months

