July 2024 - Linux-stable-mirror

[PATCH 01/24] fs/netfs/fscache_cookie: add missing "n_accesses" check

by David Howells

From: Max Kellermann <max.kellermann(a)ionos.com> This fixes a NULL pointer dereference bug due to a data race which looks like this: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 33 PID: 16573 Comm: kworker/u97:799 Not tainted 6.8.7-cm4all1-hp+ #43 Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/17/2018 Workqueue: events_unbound netfs_rreq_write_to_cache_work RIP: 0010:cachefiles_prepare_write+0x30/0xa0 Code: 57 41 56 45 89 ce 41 55 49 89 cd 41 54 49 89 d4 55 53 48 89 fb 48 83 ec 08 48 8b 47 08 48 83 7f 10 00 48 89 34 24 48 8b 68 20 <48> 8b 45 08 4c 8b 38 74 45 49 8b 7f 50 e8 4e a9 b0 ff 48 8b 73 10 RSP: 0018:ffffb4e78113bde0 EFLAGS: 00010286 RAX: ffff976126be6d10 RBX: ffff97615cdb8438 RCX: 0000000000020000 RDX: ffff97605e6c4c68 RSI: ffff97605e6c4c60 RDI: ffff97615cdb8438 RBP: 0000000000000000 R08: 0000000000278333 R09: 0000000000000001 R10: ffff97605e6c4600 R11: 0000000000000001 R12: ffff97605e6c4c68 R13: 0000000000020000 R14: 0000000000000001 R15: ffff976064fe2c00 FS: 0000000000000000(0000) GS:ffff9776dfd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000005942c002 CR4: 00000000001706f0 Call Trace: <TASK> ? __die+0x1f/0x70 ? page_fault_oops+0x15d/0x440 ? search_module_extables+0xe/0x40 ? fixup_exception+0x22/0x2f0 ? exc_page_fault+0x5f/0x100 ? asm_exc_page_fault+0x22/0x30 ? cachefiles_prepare_write+0x30/0xa0 netfs_rreq_write_to_cache_work+0x135/0x2e0 process_one_work+0x137/0x2c0 worker_thread+0x2e9/0x400 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> Modules linked in: CR2: 0000000000000008 ---[ end trace 0000000000000000 ]--- This happened because fscache_cookie_state_machine() was slow and was still running while another process invoked fscache_unuse_cookie(); this led to a fscache_cookie_lru_do_one() call, setting the FSCACHE_COOKIE_DO_LRU_DISCARD flag, which was picked up by fscache_cookie_state_machine(), withdrawing the cookie via cachefiles_withdraw_cookie(), clearing cookie->cache_priv. At the same time, yet another process invoked cachefiles_prepare_write(), which found a NULL pointer in this code line: struct cachefiles_object *object = cachefiles_cres_object(cres); The next line crashes, obviously: struct cachefiles_cache *cache = object->volume->cache; During cachefiles_prepare_write(), the "n_accesses" counter is non-zero (via fscache_begin_operation()). The cookie must not be withdrawn until it drops to zero. The counter is checked by fscache_cookie_state_machine() before switching to FSCACHE_COOKIE_STATE_RELINQUISHING and FSCACHE_COOKIE_STATE_WITHDRAWING (in "case FSCACHE_COOKIE_STATE_FAILED"), but not for FSCACHE_COOKIE_STATE_LRU_DISCARDING ("case FSCACHE_COOKIE_STATE_ACTIVE"). This patch adds the missing check. With a non-zero access counter, the function returns and the next fscache_end_cookie_access() call will queue another fscache_cookie_state_machine() call to handle the still-pending FSCACHE_COOKIE_DO_LRU_DISCARD. Fixes: 12bb21a29c19 ("fscache: Implement cookie user counting and resource pinning") Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com> Signed-off-by: David Howells <dhowells(a)redhat.com> cc: Jeff Layton <jlayton(a)kernel.org> cc: netfs(a)lists.linux.dev cc: linux-fsdevel(a)vger.kernel.org cc: stable(a)vger.kernel.org --- fs/netfs/fscache_cookie.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/netfs/fscache_cookie.c b/fs/netfs/fscache_cookie.c index bce2492186d0..d4d4b3a8b106 100644 --- a/fs/netfs/fscache_cookie.c +++ b/fs/netfs/fscache_cookie.c @@ -741,6 +741,10 @@ static void fscache_cookie_state_machine(struct fscache_cookie *cookie) spin_lock(&cookie->lock); } if (test_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags)) { + if (atomic_read(&cookie->n_accesses) != 0) + /* still being accessed: postpone it */ + break; + __fscache_set_cookie_state(cookie, FSCACHE_COOKIE_STATE_LRU_DISCARDING); wake = true;

11 months, 3 weeks

1
0
0 0

[PATCH] kcov: properly check for softirq context

by andrey.konovalov＠linux.dev

From: Andrey Konovalov <andreyknvl(a)gmail.com> When collecting coverage from softirqs, KCOV uses in_serving_softirq() to check whether the code is running in the softirq context. Unfortunately, in_serving_softirq() is > 0 even when the code is running in the hardirq or NMI context for hardirqs and NMIs that happened during a softirq. As a result, if a softirq handler contains a remote coverage collection section and a hardirq with another remote coverage collection section happens during handling the softirq, KCOV incorrectly detects a nested softirq coverate collection section and prints a WARNING, as reported by syzbot. This issue was exposed by commit a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler"), which switched dummy_hcd to using hrtimer and made the timer's callback be executed in the hardirq context. Change the related checks in KCOV to account for this behavior of in_serving_softirq() and make KCOV ignore remote coverage collection sections in the hardirq and NMI contexts. This prevents the WARNING printed by syzbot but does not fix the inability of KCOV to collect coverage from the __usb_hcd_giveback_urb when dummy_hcd is in use (caused by a7f3813e589f); a separate patch is required for that. Reported-by: syzbot+2388cdaeb6b10f0c13ac(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13ac Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts") Cc: stable(a)vger.kernel.org Signed-off-by: Andrey Konovalov <andreyknvl(a)gmail.com> --- kernel/kcov.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/kernel/kcov.c b/kernel/kcov.c index f0a69d402066e..274b6b7c718de 100644 --- a/kernel/kcov.c +++ b/kernel/kcov.c @@ -161,6 +161,15 @@ static void kcov_remote_area_put(struct kcov_remote_area *area, kmsan_unpoison_memory(&area->list, sizeof(area->list)); } +/* + * Unlike in_serving_softirq(), this function returns false when called during + * a hardirq or an NMI that happened in the softirq context. + */ +static inline bool in_softirq_really(void) +{ + return in_serving_softirq() && !in_hardirq() && !in_nmi(); +} + static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t) { unsigned int mode; @@ -170,7 +179,7 @@ static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_stru * so we ignore code executed in interrupts, unless we are in a remote * coverage collection section in a softirq. */ - if (!in_task() && !(in_serving_softirq() && t->kcov_softirq)) + if (!in_task() && !(in_softirq_really() && t->kcov_softirq)) return false; mode = READ_ONCE(t->kcov_mode); /* @@ -849,7 +858,7 @@ void kcov_remote_start(u64 handle) if (WARN_ON(!kcov_check_handle(handle, true, true, true))) return; - if (!in_task() && !in_serving_softirq()) + if (!in_task() && !in_softirq_really()) return; local_lock_irqsave(&kcov_percpu_data.lock, flags); @@ -991,7 +1000,7 @@ void kcov_remote_stop(void) int sequence; unsigned long flags; - if (!in_task() && !in_serving_softirq()) + if (!in_task() && !in_softirq_really()) return; local_lock_irqsave(&kcov_percpu_data.lock, flags); -- 2.25.1

11 months, 3 weeks

2
2
0 0

[PATCH v3 0/2] Rust and the shadow call stack sanitizer

by Alice Ryhl

This patch series makes it possible to use Rust together with the shadow call stack sanitizer. The first patch is intended to be backported to ensure that people don't try to use SCS with Rust on older kernel versions. The second patch makes it possible to use Rust with the shadow call stack sanitizer. The second patch in this series doesn't make sense without [1], though it doesn't break the build if [1] is missing. Link: https://lore.kernel.org/rust-for-linux/20240701183625.665574-12-ojeda@kerne… [1] Signed-off-by: Alice Ryhl <aliceryhl(a)google.com> --- Changes in v3: - Use -Zfixed-x18. - Add logic to reject unsupported rustc versions. - Also include a fix to be backported. - Link to v2: https://lore.kernel.org/rust-for-linux/20240305-shadow-call-stack-v2-1-c7b4… Changes in v2: - Add -Cforce-unwind-tables flag. - Link to v1: https://lore.kernel.org/rust-for-linux/20240304-shadow-call-stack-v1-1-f055… --- Alice Ryhl (2): rust: SHADOW_CALL_STACK is incompatible with Rust rust: add flags for shadow call stack sanitizer Makefile | 1 + arch/Kconfig | 1 + arch/arm64/Makefile | 3 +++ 3 files changed, 5 insertions(+) --- base-commit: 83b1e6e4170cf96b2a7c49070dd43749649f454e change-id: 20240304-shadow-call-stack-9c197a4361d9 Best regards, -- Alice Ryhl <aliceryhl(a)google.com>

11 months, 3 weeks

2
3
0 0

[PATCH v1 03/36] soc: fsl: cpm1: tsa: Fix tsa_write8()

by Herve Codina

The tsa_write8() parameter is an u32 value. This is not consistent with the function itself. Indeed, tsa_write8() writes an 8bits value. Be consistent and use an u8 parameter value. Fixes: 1d4ba0b81c1c ("soc: fsl: cpm1: Add support for TSA") Cc: stable(a)vger.kernel.org Signed-off-by: Herve Codina <herve.codina(a)bootlin.com> --- drivers/soc/fsl/qe/tsa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soc/fsl/qe/tsa.c b/drivers/soc/fsl/qe/tsa.c index 6c5741cf5e9d..53968ea84c88 100644 --- a/drivers/soc/fsl/qe/tsa.c +++ b/drivers/soc/fsl/qe/tsa.c @@ -140,7 +140,7 @@ static inline void tsa_write32(void __iomem *addr, u32 val) iowrite32be(val, addr); } -static inline void tsa_write8(void __iomem *addr, u32 val) +static inline void tsa_write8(void __iomem *addr, u8 val) { iowrite8(val, addr); } -- 2.45.0

11 months, 3 weeks

1
0
0 0

[PATCH v1 01/36] soc: fsl: cpm1: qmc: Update TRNSYNC only in transparent mode

by Herve Codina

The TRNSYNC feature is available (and enabled) only in transparent mode. Since commit 7cc9bda9c163 ("soc: fsl: cpm1: qmc: Handle timeslot entries at channel start() and stop()") TRNSYNC register is updated in transparent and hdlc mode. In hdlc mode, the address of the TRNSYNC register is used by the QMC for other internal purpose. Even if no weird results were observed in hdlc mode, touching this register in this mode is wrong. Update TRNSYNC only in transparent mode. Fixes: 7cc9bda9c163 ("soc: fsl: cpm1: qmc: Handle timeslot entries at channel start() and stop()") Cc: stable(a)vger.kernel.org Signed-off-by: Herve Codina <herve.codina(a)bootlin.com> --- drivers/soc/fsl/qe/qmc.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/drivers/soc/fsl/qe/qmc.c b/drivers/soc/fsl/qe/qmc.c index 76bb496305a0..bacabf731dcb 100644 --- a/drivers/soc/fsl/qe/qmc.c +++ b/drivers/soc/fsl/qe/qmc.c @@ -940,11 +940,13 @@ static int qmc_chan_start_rx(struct qmc_chan *chan) goto end; } - ret = qmc_setup_chan_trnsync(chan->qmc, chan); - if (ret) { - dev_err(chan->qmc->dev, "chan %u: setup TRNSYNC failed (%d)\n", - chan->id, ret); - goto end; + if (chan->mode == QMC_TRANSPARENT) { + ret = qmc_setup_chan_trnsync(chan->qmc, chan); + if (ret) { + dev_err(chan->qmc->dev, "chan %u: setup TRNSYNC failed (%d)\n", + chan->id, ret); + goto end; + } } /* Restart the receiver */ @@ -982,11 +984,13 @@ static int qmc_chan_start_tx(struct qmc_chan *chan) goto end; } - ret = qmc_setup_chan_trnsync(chan->qmc, chan); - if (ret) { - dev_err(chan->qmc->dev, "chan %u: setup TRNSYNC failed (%d)\n", - chan->id, ret); - goto end; + if (chan->mode == QMC_TRANSPARENT) { + ret = qmc_setup_chan_trnsync(chan->qmc, chan); + if (ret) { + dev_err(chan->qmc->dev, "chan %u: setup TRNSYNC failed (%d)\n", + chan->id, ret); + goto end; + } } /* -- 2.45.0

11 months, 3 weeks

1
0
0 0

[tip: irq/urgent] irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t'

by tip-bot2 for Arseniy Krasnov

The following commit has been merged into the irq/urgent branch of tip: Commit-ID: f872d4af79fe8c71ae291ce8875b477e1669a6c7 Gitweb: https://git.kernel.org/tip/f872d4af79fe8c71ae291ce8875b477e1669a6c7 Author: Arseniy Krasnov <avkrasnov(a)salutedevices.com> AuthorDate: Mon, 29 Jul 2024 16:18:50 +03:00 Committer: Thomas Gleixner <tglx(a)linutronix.de> CommitterDate: Mon, 29 Jul 2024 15:43:50 +02:00 irqchip/meson-gpio: Convert meson_gpio_irq_controller::lock to 'raw_spinlock_t' This lock is acquired under irq_desc::lock with interrupts disabled. When PREEMPT_RT is enabled, 'spinlock_t' becomes preemptible, which results in invalid lock acquire context; [ BUG: Invalid wait context ] swapper/0/1 is trying to lock: ffff0000008fed30 (&ctl->lock){....}-{3:3}, at: meson_gpio_irq_update_bits0 other info that might help us debug this: context-{5:5} 3 locks held by swapper/0/1: #0: ffff0000003cd0f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x90c #1: ffff000004714650 (&desc->request_mutex){+.+.}-{4:4}, at: __setup_irq0 #2: ffff0000047144c8 (&irq_desc_lock_class){-.-.}-{2:2}, at: __setup_irq0 stack backtrace: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.9.9-sdkernel #1 Call trace: _raw_spin_lock_irqsave+0x60/0x88 meson_gpio_irq_update_bits+0x34/0x70 meson8_gpio_irq_set_type+0x78/0xc4 meson_gpio_irq_set_type+0x30/0x60 __irq_set_trigger+0x60/0x180 __setup_irq+0x30c/0x6e0 request_threaded_irq+0xec/0x1a4 Fixes: 215f4cc0fb20 ("irqchip/meson: Add support for gpio interrupt controller") Signed-off-by: Arseniy Krasnov <avkrasnov(a)salutedevices.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Cc: stable(a)vger.kernel.org Link: https://lore.kernel.org/all/20240729131850.3015508-1-avkrasnov@salutedevice… --- drivers/irqchip/irq-meson-gpio.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/irqchip/irq-meson-gpio.c b/drivers/irqchip/irq-meson-gpio.c index 27e30ce..cd789fa 100644 --- a/drivers/irqchip/irq-meson-gpio.c +++ b/drivers/irqchip/irq-meson-gpio.c @@ -178,7 +178,7 @@ struct meson_gpio_irq_controller { void __iomem *base; u32 channel_irqs[MAX_NUM_CHANNEL]; DECLARE_BITMAP(channel_map, MAX_NUM_CHANNEL); - spinlock_t lock; + raw_spinlock_t lock; }; static void meson_gpio_irq_update_bits(struct meson_gpio_irq_controller *ctl, @@ -187,14 +187,14 @@ static void meson_gpio_irq_update_bits(struct meson_gpio_irq_controller *ctl, unsigned long flags; u32 tmp; - spin_lock_irqsave(&ctl->lock, flags); + raw_spin_lock_irqsave(&ctl->lock, flags); tmp = readl_relaxed(ctl->base + reg); tmp &= ~mask; tmp |= val; writel_relaxed(tmp, ctl->base + reg); - spin_unlock_irqrestore(&ctl->lock, flags); + raw_spin_unlock_irqrestore(&ctl->lock, flags); } static void meson_gpio_irq_init_dummy(struct meson_gpio_irq_controller *ctl) @@ -244,12 +244,12 @@ meson_gpio_irq_request_channel(struct meson_gpio_irq_controller *ctl, unsigned long flags; unsigned int idx; - spin_lock_irqsave(&ctl->lock, flags); + raw_spin_lock_irqsave(&ctl->lock, flags); /* Find a free channel */ idx = find_first_zero_bit(ctl->channel_map, ctl->params->nr_channels); if (idx >= ctl->params->nr_channels) { - spin_unlock_irqrestore(&ctl->lock, flags); + raw_spin_unlock_irqrestore(&ctl->lock, flags); pr_err("No channel available\n"); return -ENOSPC; } @@ -257,7 +257,7 @@ meson_gpio_irq_request_channel(struct meson_gpio_irq_controller *ctl, /* Mark the channel as used */ set_bit(idx, ctl->channel_map); - spin_unlock_irqrestore(&ctl->lock, flags); + raw_spin_unlock_irqrestore(&ctl->lock, flags); /* * Setup the mux of the channel to route the signal of the pad @@ -567,7 +567,7 @@ static int meson_gpio_irq_of_init(struct device_node *node, struct device_node * if (!ctl) return -ENOMEM; - spin_lock_init(&ctl->lock); + raw_spin_lock_init(&ctl->lock); ctl->base = of_iomap(node, 0); if (!ctl->base) {

11 months, 3 weeks

1
0
0 0

Re: [PATCH v2 0/2] ALSA: firewire-lib: restore process context workqueue to prevent deadlock

by edmund.raile

> Thank you for your sending the revised patches, it looks better than the > previous one. However, I have an additional request. Allright, patch v3 it is. > [1] https://git-scm.com/docs/git-revert Should have known git has something like that, how handy! > $ git revert -s b5b519965c4c Yes, 5b5 can be removed via revert, but what is the difference in effect? Just time saving? > $ git revert -s 7ba5ca32fe6e This one I'd like to ask you about: The original inline comment in amdtp-stream.c amdtp_domain_stream_pcm_pointer() ``` // This function is called in software IRQ context of // period_work or process context. // // When the software IRQ context was scheduled by software IRQ // context of IT contexts, queued packets were already handled. // Therefore, no need to flush the queue in buffer furthermore. // // When the process context reach here, some packets will be // already queued in the buffer. These packets should be handled // immediately to keep better granularity of PCM pointer. // // Later, the process context will sometimes schedules software // IRQ context of the period_work. Then, no need to flush the // queue by the same reason as described in the above ``` (let's call the above v1) was replaced with ``` // In software IRQ context, the call causes dead-lock to disable the tasklet // synchronously. ``` on occasion of 7ba5ca32fe6e (let's call this v2). I sought to replace it with ``` // use wq to prevent deadlock between process context spin_lock // of snd_pcm_stream_lock_irq() in snd_pcm_status64() and // softIRQ context spin_lock of snd_pcm_stream_lock_irqsave() // in snd_pcm_period_elapsed() ``` to prevent this issue from occurring again (let's call this v3). Should I include v1, v3 or a combination of v1 and v3 in my next patch? > Just for safe, it is preferable to execute 'scripts/checkpatch.pl' in > kernel tree to check the patchset generated by send-email subcommand[3]. Absolutely should have done so, sorry. Thank you for your patience and guidance, Edmund Raile.

11 months, 3 weeks

2
1
0 0

[PATCH net 0/5] mptcp: fix signal endpoint readd

by Matthieu Baerts (NGI0)

Issue #501 [1] showed that the Netlink PM currently doesn't correctly support removal and re-add of signal endpoints. Patches 1 and 2 address the issue: the first one in the userspace path- manager, introduced in v5.19 ; and the second one in the in-kernel path- manager, introduced in v5.7. Patch 3 introduces a related selftest. There is no 'Fixes' tag, because it might be hard to backport it automatically, as missing helpers in Bash will not be caught when compiling the kernel or the selftests. The last two patches address two small issues in the MPTCP selftests, one introduced in v6.6., and the other one in v5.17. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/501 [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> --- Liu Jing (1): selftests: mptcp: always close input's FD if opened Paolo Abeni (4): mptcp: fix user-space PM announced address accounting mptcp: fix NL PM announced address accounting selftests: mptcp: add explicit test case for remove/readd selftests: mptcp: fix error path net/mptcp/pm_netlink.c | 27 ++++++++++++++------ tools/testing/selftests/net/mptcp/mptcp_connect.c | 8 +++--- tools/testing/selftests/net/mptcp/mptcp_join.sh | 31 ++++++++++++++++++++++- 3 files changed, 53 insertions(+), 13 deletions(-) --- base-commit: 301927d2d2eb8e541357ba850bc7a1a74dbbd670 change-id: 20240726-upstream-net-20240726-mptcp-fix-signal-readd-f3c72bbcbceb Best regards, -- Matthieu Baerts (NGI0) <matttbe(a)kernel.org>

11 months, 3 weeks

2
5
0 0

FAILED: patch "[PATCH] nilfs2: handle inconsistent state in" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y git checkout FETCH_HEAD git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072930-unbridle-negotiate-9977@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^.. Possible dependencies: 4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001 From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Date: Thu, 25 Jul 2024 14:20:07 +0900 Subject: [PATCH] nilfs2: handle inconsistent state in nilfs_btnode_create_block() Syzbot reported that a buffer state inconsistency was detected in nilfs_btnode_create_block(), triggering a kernel bug. It is not appropriate to treat this inconsistency as a bug; it can occur if the argument block address (the buffer index of the newly created block) is a virtual block number and has been reallocated due to corruption of the bitmap used to manage its allocation state. So, modify nilfs_btnode_create_block() and its callers to treat it as a possible filesystem error, rather than triggering a kernel bug. Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com Fixes: a60be987d45d ("nilfs2: B-tree node cache") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60 Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c index 0131d83b912d..c034080c334b 100644 --- a/fs/nilfs2/btnode.c +++ b/fs/nilfs2/btnode.c @@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr) bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node)); if (unlikely(!bh)) - return NULL; + return ERR_PTR(-ENOMEM); if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) || buffer_dirty(bh))) { - brelse(bh); - BUG(); + /* + * The block buffer at the specified new address was already + * in use. This can happen if it is a virtual block number + * and has been reallocated due to corruption of the bitmap + * used to manage its allocation state (if not, the buffer + * clearing of an abandoned b-tree node is missing somewhere). + */ + nilfs_error(inode->i_sb, + "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)", + (unsigned long long)blocknr, inode->i_ino); + goto failed; } memset(bh->b_data, 0, i_blocksize(inode)); bh->b_bdev = inode->i_sb->s_bdev; @@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr) folio_unlock(bh->b_folio); folio_put(bh->b_folio); return bh; + +failed: + folio_unlock(bh->b_folio); + folio_put(bh->b_folio); + brelse(bh); + return ERR_PTR(-EIO); } int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr, @@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc, } nbh = nilfs_btnode_create_block(btnc, newkey); - if (!nbh) - return -ENOMEM; + if (IS_ERR(nbh)) + return PTR_ERR(nbh); BUG_ON(nbh == obh); ctxt->newbh = nbh; diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c index a139970e4804..862bdf23120e 100644 --- a/fs/nilfs2/btree.c +++ b/fs/nilfs2/btree.c @@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree, struct buffer_head *bh; bh = nilfs_btnode_create_block(btnc, ptr); - if (!bh) - return -ENOMEM; + if (IS_ERR(bh)) + return PTR_ERR(bh); set_buffer_nilfs_volatile(bh); *bhp = bh;

11 months, 3 weeks

1
0
0 0

FAILED: patch "[PATCH] nilfs2: handle inconsistent state in" failed to apply to 5.4-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.4-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y git checkout FETCH_HEAD git cherry-pick -x 4811f7af6090e8f5a398fbdd766f903ef6c0d787 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072929-blip-squad-13da@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^.. Possible dependencies: 4811f7af6090 ("nilfs2: handle inconsistent state in nilfs_btnode_create_block()") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 4811f7af6090e8f5a398fbdd766f903ef6c0d787 Mon Sep 17 00:00:00 2001 From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Date: Thu, 25 Jul 2024 14:20:07 +0900 Subject: [PATCH] nilfs2: handle inconsistent state in nilfs_btnode_create_block() Syzbot reported that a buffer state inconsistency was detected in nilfs_btnode_create_block(), triggering a kernel bug. It is not appropriate to treat this inconsistency as a bug; it can occur if the argument block address (the buffer index of the newly created block) is a virtual block number and has been reallocated due to corruption of the bitmap used to manage its allocation state. So, modify nilfs_btnode_create_block() and its callers to treat it as a possible filesystem error, rather than triggering a kernel bug. Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com Fixes: a60be987d45d ("nilfs2: B-tree node cache") Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Reported-by: syzbot+89cc4f2324ed37988b60(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60 Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c index 0131d83b912d..c034080c334b 100644 --- a/fs/nilfs2/btnode.c +++ b/fs/nilfs2/btnode.c @@ -51,12 +51,21 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr) bh = nilfs_grab_buffer(inode, btnc, blocknr, BIT(BH_NILFS_Node)); if (unlikely(!bh)) - return NULL; + return ERR_PTR(-ENOMEM); if (unlikely(buffer_mapped(bh) || buffer_uptodate(bh) || buffer_dirty(bh))) { - brelse(bh); - BUG(); + /* + * The block buffer at the specified new address was already + * in use. This can happen if it is a virtual block number + * and has been reallocated due to corruption of the bitmap + * used to manage its allocation state (if not, the buffer + * clearing of an abandoned b-tree node is missing somewhere). + */ + nilfs_error(inode->i_sb, + "state inconsistency probably due to duplicate use of b-tree node block address %llu (ino=%lu)", + (unsigned long long)blocknr, inode->i_ino); + goto failed; } memset(bh->b_data, 0, i_blocksize(inode)); bh->b_bdev = inode->i_sb->s_bdev; @@ -67,6 +76,12 @@ nilfs_btnode_create_block(struct address_space *btnc, __u64 blocknr) folio_unlock(bh->b_folio); folio_put(bh->b_folio); return bh; + +failed: + folio_unlock(bh->b_folio); + folio_put(bh->b_folio); + brelse(bh); + return ERR_PTR(-EIO); } int nilfs_btnode_submit_block(struct address_space *btnc, __u64 blocknr, @@ -217,8 +232,8 @@ int nilfs_btnode_prepare_change_key(struct address_space *btnc, } nbh = nilfs_btnode_create_block(btnc, newkey); - if (!nbh) - return -ENOMEM; + if (IS_ERR(nbh)) + return PTR_ERR(nbh); BUG_ON(nbh == obh); ctxt->newbh = nbh; diff --git a/fs/nilfs2/btree.c b/fs/nilfs2/btree.c index a139970e4804..862bdf23120e 100644 --- a/fs/nilfs2/btree.c +++ b/fs/nilfs2/btree.c @@ -63,8 +63,8 @@ static int nilfs_btree_get_new_block(const struct nilfs_bmap *btree, struct buffer_head *bh; bh = nilfs_btnode_create_block(btnc, ptr); - if (!bh) - return -ENOMEM; + if (IS_ERR(bh)) + return PTR_ERR(bh); set_buffer_nilfs_volatile(bh); *bhp = bh;

11 months, 3 weeks

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2024