July 2024 - Linux-stable-mirror

[PATCH AUTOSEL 5.4 1/7] net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()

by Sasha Levin

From: Yunshui Jiang <jiangyunshui(a)kylinos.cn> [ Upstream commit b8ec0dc3845f6c9089573cb5c2c4b05f7fc10728 ] mac802154 devices update their dev->stats fields locklessly. Therefore these counters should be updated atomically. Adopt SMP safe DEV_STATS_INC() and DEV_STATS_ADD() to achieve this. Signed-off-by: Yunshui Jiang <jiangyunshui(a)kylinos.cn> Message-ID: <20240531080739.2608969-1-jiangyunshui(a)kylinos.cn> Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- net/mac802154/tx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index c829e4a753256..7cea95d0b78f9 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -34,8 +34,8 @@ void ieee802154_xmit_worker(struct work_struct *work) if (res) goto err_tx; - dev->stats.tx_packets++; - dev->stats.tx_bytes += skb->len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, skb->len); ieee802154_xmit_complete(&local->hw, skb, false); @@ -86,8 +86,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb) goto err_tx; } - dev->stats.tx_packets++; - dev->stats.tx_bytes += len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, len); } else { local->tx_skb = skb; queue_work(local->workqueue, &local->tx_work); -- 2.43.0

12 months

1
6
0 0

[PATCH AUTOSEL 5.10 1/7] net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()

by Sasha Levin

From: Yunshui Jiang <jiangyunshui(a)kylinos.cn> [ Upstream commit b8ec0dc3845f6c9089573cb5c2c4b05f7fc10728 ] mac802154 devices update their dev->stats fields locklessly. Therefore these counters should be updated atomically. Adopt SMP safe DEV_STATS_INC() and DEV_STATS_ADD() to achieve this. Signed-off-by: Yunshui Jiang <jiangyunshui(a)kylinos.cn> Message-ID: <20240531080739.2608969-1-jiangyunshui(a)kylinos.cn> Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- net/mac802154/tx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index c829e4a753256..7cea95d0b78f9 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -34,8 +34,8 @@ void ieee802154_xmit_worker(struct work_struct *work) if (res) goto err_tx; - dev->stats.tx_packets++; - dev->stats.tx_bytes += skb->len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, skb->len); ieee802154_xmit_complete(&local->hw, skb, false); @@ -86,8 +86,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb) goto err_tx; } - dev->stats.tx_packets++; - dev->stats.tx_bytes += len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, len); } else { local->tx_skb = skb; queue_work(local->workqueue, &local->tx_work); -- 2.43.0

12 months

1
6
0 0

[PATCH AUTOSEL 5.15 1/9] net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()

by Sasha Levin

From: Yunshui Jiang <jiangyunshui(a)kylinos.cn> [ Upstream commit b8ec0dc3845f6c9089573cb5c2c4b05f7fc10728 ] mac802154 devices update their dev->stats fields locklessly. Therefore these counters should be updated atomically. Adopt SMP safe DEV_STATS_INC() and DEV_STATS_ADD() to achieve this. Signed-off-by: Yunshui Jiang <jiangyunshui(a)kylinos.cn> Message-ID: <20240531080739.2608969-1-jiangyunshui(a)kylinos.cn> Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- net/mac802154/tx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index c829e4a753256..7cea95d0b78f9 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -34,8 +34,8 @@ void ieee802154_xmit_worker(struct work_struct *work) if (res) goto err_tx; - dev->stats.tx_packets++; - dev->stats.tx_bytes += skb->len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, skb->len); ieee802154_xmit_complete(&local->hw, skb, false); @@ -86,8 +86,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb) goto err_tx; } - dev->stats.tx_packets++; - dev->stats.tx_bytes += len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, len); } else { local->tx_skb = skb; queue_work(local->workqueue, &local->tx_work); -- 2.43.0

12 months

1
8
0 0

[PATCH AUTOSEL 6.1 01/15] net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()

by Sasha Levin

From: Yunshui Jiang <jiangyunshui(a)kylinos.cn> [ Upstream commit b8ec0dc3845f6c9089573cb5c2c4b05f7fc10728 ] mac802154 devices update their dev->stats fields locklessly. Therefore these counters should be updated atomically. Adopt SMP safe DEV_STATS_INC() and DEV_STATS_ADD() to achieve this. Signed-off-by: Yunshui Jiang <jiangyunshui(a)kylinos.cn> Message-ID: <20240531080739.2608969-1-jiangyunshui(a)kylinos.cn> Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- net/mac802154/tx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index c829e4a753256..7cea95d0b78f9 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -34,8 +34,8 @@ void ieee802154_xmit_worker(struct work_struct *work) if (res) goto err_tx; - dev->stats.tx_packets++; - dev->stats.tx_bytes += skb->len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, skb->len); ieee802154_xmit_complete(&local->hw, skb, false); @@ -86,8 +86,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb) goto err_tx; } - dev->stats.tx_packets++; - dev->stats.tx_bytes += len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, len); } else { local->tx_skb = skb; queue_work(local->workqueue, &local->tx_work); -- 2.43.0

12 months

1
14
0 0

[PATCH AUTOSEL 6.6 01/18] net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()

by Sasha Levin

From: Yunshui Jiang <jiangyunshui(a)kylinos.cn> [ Upstream commit b8ec0dc3845f6c9089573cb5c2c4b05f7fc10728 ] mac802154 devices update their dev->stats fields locklessly. Therefore these counters should be updated atomically. Adopt SMP safe DEV_STATS_INC() and DEV_STATS_ADD() to achieve this. Signed-off-by: Yunshui Jiang <jiangyunshui(a)kylinos.cn> Message-ID: <20240531080739.2608969-1-jiangyunshui(a)kylinos.cn> Signed-off-by: Stefan Schmidt <stefan(a)datenfreihafen.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- net/mac802154/tx.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c index 2a6f1ed763c9b..6fbed5bb5c3e0 100644 --- a/net/mac802154/tx.c +++ b/net/mac802154/tx.c @@ -34,8 +34,8 @@ void ieee802154_xmit_sync_worker(struct work_struct *work) if (res) goto err_tx; - dev->stats.tx_packets++; - dev->stats.tx_bytes += skb->len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, skb->len); ieee802154_xmit_complete(&local->hw, skb, false); @@ -90,8 +90,8 @@ ieee802154_tx(struct ieee802154_local *local, struct sk_buff *skb) if (ret) goto err_wake_netif_queue; - dev->stats.tx_packets++; - dev->stats.tx_bytes += len; + DEV_STATS_INC(dev, tx_packets); + DEV_STATS_ADD(dev, tx_bytes, len); } else { local->tx_skb = skb; queue_work(local->workqueue, &local->sync_tx_work); -- 2.43.0

12 months

1
17
0 0

[PATCH 6.6] btrfs: tree-checker: add type and sequence check for inline backrefs

by Qu Wenruo

commit 1645c283a87c61f84b2bffd81f50724df959b11a upstream. [BUG] There is a bug report that ntfs2btrfs had a bug that it can lead to transaction abort and the filesystem flips to read-only. [CAUSE] For inline backref items, kernel has a strict requirement for their ordered, they must follow the following rules: - All btrfs_extent_inline_ref::type should be in an ascending order - Within the same type, the items should follow a descending order by their sequence number For EXTENT_DATA_REF type, the sequence number is result from hash_extent_data_ref(). For other types, their sequence numbers are btrfs_extent_inline_ref::offset. Thus if there is any code not following above rules, the resulted inline backrefs can prevent the kernel to locate the needed inline backref and lead to transaction abort. [FIX] Ntrfs2btrfs has already fixed the problem, and btrfs-progs has added the ability to detect such problems. For kernel, let's be more noisy and be more specific about the order, so that the next time kernel hits such problem we would reject it in the first place, without leading to transaction abort. Cc: stable(a)vger.kernel.org # 6.6 Link: https://github.com/kdave/btrfs-progs/pull/622 Reviewed-by: Josef Bacik <josef(a)toxicpanda.com> [ Fix a conflict due to header cleanup. ] Signed-off-by: Qu Wenruo <wqu(a)suse.com> Signed-off-by: David Sterba <dsterba(a)suse.com> --- fs/btrfs/tree-checker.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c index cc6bc5985120..5d6cfa618dc4 100644 --- a/fs/btrfs/tree-checker.c +++ b/fs/btrfs/tree-checker.c @@ -29,6 +29,7 @@ #include "accessors.h" #include "file-item.h" #include "inode-item.h" +#include "extent-tree.h" /* * Error message should follow the following format: @@ -1274,6 +1275,8 @@ static int check_extent_item(struct extent_buffer *leaf, unsigned long ptr; /* Current pointer inside inline refs */ unsigned long end; /* Extent item end */ const u32 item_size = btrfs_item_size(leaf, slot); + u8 last_type = 0; + u64 last_seq = U64_MAX; u64 flags; u64 generation; u64 total_refs; /* Total refs in btrfs_extent_item */ @@ -1320,6 +1323,18 @@ static int check_extent_item(struct extent_buffer *leaf, * 2.2) Ref type specific data * Either using btrfs_extent_inline_ref::offset, or specific * data structure. + * + * All above inline items should follow the order: + * + * - All btrfs_extent_inline_ref::type should be in an ascending + * order + * + * - Within the same type, the items should follow a descending + * order by their sequence number. The sequence number is + * determined by: + * * btrfs_extent_inline_ref::offset for all types other than + * EXTENT_DATA_REF + * * hash_extent_data_ref() for EXTENT_DATA_REF */ if (unlikely(item_size < sizeof(*ei))) { extent_err(leaf, slot, @@ -1401,6 +1416,7 @@ static int check_extent_item(struct extent_buffer *leaf, struct btrfs_extent_inline_ref *iref; struct btrfs_extent_data_ref *dref; struct btrfs_shared_data_ref *sref; + u64 seq; u64 dref_offset; u64 inline_offset; u8 inline_type; @@ -1414,6 +1430,7 @@ static int check_extent_item(struct extent_buffer *leaf, iref = (struct btrfs_extent_inline_ref *)ptr; inline_type = btrfs_extent_inline_ref_type(leaf, iref); inline_offset = btrfs_extent_inline_ref_offset(leaf, iref); + seq = inline_offset; if (unlikely(ptr + btrfs_extent_inline_ref_size(inline_type) > end)) { extent_err(leaf, slot, "inline ref item overflows extent item, ptr %lu iref size %u end %lu", @@ -1444,6 +1461,10 @@ static int check_extent_item(struct extent_buffer *leaf, case BTRFS_EXTENT_DATA_REF_KEY: dref = (struct btrfs_extent_data_ref *)(&iref->offset); dref_offset = btrfs_extent_data_ref_offset(leaf, dref); + seq = hash_extent_data_ref( + btrfs_extent_data_ref_root(leaf, dref), + btrfs_extent_data_ref_objectid(leaf, dref), + btrfs_extent_data_ref_offset(leaf, dref)); if (unlikely(!IS_ALIGNED(dref_offset, fs_info->sectorsize))) { extent_err(leaf, slot, @@ -1470,6 +1491,24 @@ static int check_extent_item(struct extent_buffer *leaf, inline_type); return -EUCLEAN; } + if (inline_type < last_type) { + extent_err(leaf, slot, + "inline ref out-of-order: has type %u, prev type %u", + inline_type, last_type); + return -EUCLEAN; + } + /* Type changed, allow the sequence starts from U64_MAX again. */ + if (inline_type > last_type) + last_seq = U64_MAX; + if (seq > last_seq) { + extent_err(leaf, slot, +"inline ref out-of-order: has type %u offset %llu seq 0x%llx, prev type %u seq 0x%llx", + inline_type, inline_offset, seq, + last_type, last_seq); + return -EUCLEAN; + } + last_type = inline_type; + last_seq = seq; ptr += btrfs_extent_inline_ref_size(inline_type); } /* No padding is allowed */ -- 2.45.2

12 months

2
1
0 0

[PATCH 6.1] sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath

by John Stultz

commit ddae0ca2a8fe12d0e24ab10ba759c3fbd755ada8 upstream. It was reported that in moving to 6.1, a larger then 10% regression was seen in the performance of clock_gettime(CLOCK_THREAD_CPUTIME_ID,...). Using a simple reproducer, I found: 5.10: 100000000 calls in 24345994193 ns => 243.460 ns per call 100000000 calls in 24288172050 ns => 242.882 ns per call 100000000 calls in 24289135225 ns => 242.891 ns per call 6.1: 100000000 calls in 28248646742 ns => 282.486 ns per call 100000000 calls in 28227055067 ns => 282.271 ns per call 100000000 calls in 28177471287 ns => 281.775 ns per call The cause of this was finally narrowed down to the addition of psi_account_irqtime() in update_rq_clock_task(), in commit 52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure"). In my initial attempt to resolve this, I leaned towards moving all accounting work out of the clock_gettime() call path, but it wasn't very pretty, so it will have to wait for a later deeper rework. Instead, Peter shared this approach: Rework psi_account_irqtime() to use its own psi_irq_time base for accounting, and move it out of the hotpath, calling it instead from sched_tick() and __schedule(). In testing this, we found the importance of ensuring psi_account_irqtime() is run under the rq_lock, which Johannes Weiner helpfully explained, so also add some lockdep annotations to make that requirement clear. With this change the performance is back in-line with 5.10: 6.1+fix: 100000000 calls in 24297324597 ns => 242.973 ns per call 100000000 calls in 24318869234 ns => 243.189 ns per call 100000000 calls in 24291564588 ns => 242.916 ns per call Reported-by: Jimmy Shiu <jimmyshiu(a)google.com> Originally-by: Peter Zijlstra <peterz(a)infradead.org> Signed-off-by: John Stultz <jstultz(a)google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org> Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev> Reviewed-by: Qais Yousef <qyousef(a)layalina.io> Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com Fixes: 52b1364ba0b1 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ pressure") [jstultz: Fixed up minor collisions w/ 6.1-stable] Signed-off-by: John Stultz <jstultz(a)google.com> --- kernel/sched/core.c | 7 +++++-- kernel/sched/psi.c | 21 ++++++++++++++++----- kernel/sched/sched.h | 1 + kernel/sched/stats.h | 11 ++++++++--- 4 files changed, 30 insertions(+), 10 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index d71234729edb..cac41c49bd2f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -701,7 +701,6 @@ static void update_rq_clock_task(struct rq *rq, s64 delta) rq->prev_irq_time += irq_delta; delta -= irq_delta; - psi_account_irqtime(rq->curr, irq_delta); #endif #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING if (static_key_false((&paravirt_steal_rq_enabled))) { @@ -5500,7 +5499,7 @@ void scheduler_tick(void) { int cpu = smp_processor_id(); struct rq *rq = cpu_rq(cpu); - struct task_struct *curr = rq->curr; + struct task_struct *curr; struct rq_flags rf; unsigned long thermal_pressure; u64 resched_latency; @@ -5512,6 +5511,9 @@ void scheduler_tick(void) rq_lock(rq, &rf); + curr = rq->curr; + psi_account_irqtime(rq, curr, NULL); + update_rq_clock(rq); thermal_pressure = arch_scale_thermal_pressure(cpu_of(rq)); update_thermal_load_avg(rq_clock_thermal(rq), rq, thermal_pressure); @@ -6550,6 +6552,7 @@ static void __sched notrace __schedule(unsigned int sched_mode) ++*switch_count; migrate_disable_switch(rq, prev); + psi_account_irqtime(rq, prev, next); psi_sched_switch(prev, next, !task_on_rq_queued(prev)); trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next, prev_state); diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c index 80d8c10e9363..81dbced92df5 100644 --- a/kernel/sched/psi.c +++ b/kernel/sched/psi.c @@ -785,6 +785,7 @@ static void psi_group_change(struct psi_group *group, int cpu, enum psi_states s; u32 state_mask; + lockdep_assert_rq_held(cpu_rq(cpu)); groupc = per_cpu_ptr(group->pcpu, cpu); /* @@ -1003,19 +1004,29 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next, } #ifdef CONFIG_IRQ_TIME_ACCOUNTING -void psi_account_irqtime(struct task_struct *task, u32 delta) +void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_struct *prev) { - int cpu = task_cpu(task); + int cpu = task_cpu(curr); struct psi_group *group; struct psi_group_cpu *groupc; - u64 now; + u64 now, irq; + s64 delta; - if (!task->pid) + if (!curr->pid) + return; + + lockdep_assert_rq_held(rq); + group = task_psi_group(curr); + if (prev && task_psi_group(prev) == group) return; now = cpu_clock(cpu); + irq = irq_time_read(cpu); + delta = (s64)(irq - rq->psi_irq_time); + if (delta < 0) + return; + rq->psi_irq_time = irq; - group = task_psi_group(task); do { if (!group->enabled) continue; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index b62d53d7c264..81d9698f0a1e 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1084,6 +1084,7 @@ struct rq { #ifdef CONFIG_IRQ_TIME_ACCOUNTING u64 prev_irq_time; + u64 psi_irq_time; #endif #ifdef CONFIG_PARAVIRT u64 prev_steal_time; diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h index 84a188913cc9..b49a96fad1d2 100644 --- a/kernel/sched/stats.h +++ b/kernel/sched/stats.h @@ -110,8 +110,12 @@ __schedstats_from_se(struct sched_entity *se) void psi_task_change(struct task_struct *task, int clear, int set); void psi_task_switch(struct task_struct *prev, struct task_struct *next, bool sleep); -void psi_account_irqtime(struct task_struct *task, u32 delta); - +#ifdef CONFIG_IRQ_TIME_ACCOUNTING +void psi_account_irqtime(struct rq *rq, struct task_struct *curr, struct task_struct *prev); +#else +static inline void psi_account_irqtime(struct rq *rq, struct task_struct *curr, + struct task_struct *prev) {} +#endif /*CONFIG_IRQ_TIME_ACCOUNTING */ /* * PSI tracks state that persists across sleeps, such as iowaits and * memory stalls. As a result, it has to distinguish between sleeps, @@ -206,7 +210,8 @@ static inline void psi_ttwu_dequeue(struct task_struct *p) {} static inline void psi_sched_switch(struct task_struct *prev, struct task_struct *next, bool sleep) {} -static inline void psi_account_irqtime(struct task_struct *task, u32 delta) {} +static inline void psi_account_irqtime(struct rq *rq, struct task_struct *curr, + struct task_struct *prev) {} #endif /* CONFIG_PSI */ #ifdef CONFIG_SCHED_INFO -- 2.45.2.993.g49e7a77208-goog

12 months

2
1
0 0

[PATCH 6.6/6.9] ext4: avoid ptr null pointer dereference

by libaokun＠huaweicloud.com

From: Baokun Li <libaokun1(a)huawei.com> When commit 13df4d44a3aa ("ext4: fix slab-out-of-bounds in ext4_mb_find_good_group_avg_frag_lists()") was backported to stable, the commit f536808adcc3 ("ext4: refactor out ext4_generic_attr_store()") that uniformly determines if the ptr is null is not merged in, so it needs to be judged whether ptr is null or not in each case of the switch, otherwise null pointer dereferencing may occur. Signed-off-by: Baokun Li <libaokun1(a)huawei.com> --- fs/ext4/sysfs.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ext4/sysfs.c b/fs/ext4/sysfs.c index 63cbda3700ea..d65dccb44ed5 100644 --- a/fs/ext4/sysfs.c +++ b/fs/ext4/sysfs.c @@ -473,6 +473,8 @@ static ssize_t ext4_attr_store(struct kobject *kobj, *((unsigned int *) ptr) = t; return len; case attr_clusters_in_group: + if (!ptr) + return 0; ret = kstrtouint(skip_spaces(buf), 0, &t); if (ret) return ret; -- 2.39.2

12 months

4
4
0 0

[PATCH 4.19 5.4 5.10 5.15 6.1 6.6] nilfs2: fix kernel bug on rename operation of broken directory

by Ryusuke Konishi

commit a9e1ddc09ca55746079cc479aa3eb6411f0d99d4 upstream. Syzbot reported that in rename directory operation on broken directory on nilfs2, __block_write_begin_int() called to prepare block write may fail BUG_ON check for access exceeding the folio/page size. This is because nilfs_dotdot(), which gets parent directory reference entry ("..") of the directory to be moved or renamed, does not check consistency enough, and may return location exceeding folio/page size for broken directories. Fix this issue by checking required directory entries ("." and "..") in the first chunk of the directory in nilfs_dotdot(). Link: https://lkml.kernel.org/r/20240628165107.9006-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Reported-by: syzbot+d3abed1ad3d367fa2627(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d3abed1ad3d367fa2627 Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations") Tested-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- Please apply this patch to the stable trees indicated by the subject prefix instead of the patch that failed. This patch is tailored to take page/folio conversion into account and can be applied to these stable trees. Also, all the builds and tests I did on each stable tree passed. Thanks, Ryusuke Konishi fs/nilfs2/dir.c | 32 ++++++++++++++++++++++++++++++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c index 51c982ad9608..53e4e63c607e 100644 --- a/fs/nilfs2/dir.c +++ b/fs/nilfs2/dir.c @@ -396,11 +396,39 @@ nilfs_find_entry(struct inode *dir, const struct qstr *qstr, struct nilfs_dir_entry *nilfs_dotdot(struct inode *dir, struct page **p) { - struct nilfs_dir_entry *de = nilfs_get_page(dir, 0, p); + struct page *page; + struct nilfs_dir_entry *de, *next_de; + size_t limit; + char *msg; + de = nilfs_get_page(dir, 0, &page); if (IS_ERR(de)) return NULL; - return nilfs_next_entry(de); + + limit = nilfs_last_byte(dir, 0); /* is a multiple of chunk size */ + if (unlikely(!limit || le64_to_cpu(de->inode) != dir->i_ino || + !nilfs_match(1, ".", de))) { + msg = "missing '.'"; + goto fail; + } + + next_de = nilfs_next_entry(de); + /* + * If "next_de" has not reached the end of the chunk, there is + * at least one more record. Check whether it matches "..". + */ + if (unlikely((char *)next_de == (char *)de + nilfs_chunk_size(dir) || + !nilfs_match(2, "..", next_de))) { + msg = "missing '..'"; + goto fail; + } + *p = page; + return next_de; + +fail: + nilfs_error(dir->i_sb, "directory #%lu %s", dir->i_ino, msg); + nilfs_put_page(page); + return NULL; } ino_t nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr) -- 2.43.5

12 months

2
1
0 0

[PATCH 4.14] SUNRPC: Fix RPC client cleaned up the freed pipefs dentries

by Hagar Hemdan

From: felix <fuzhen5(a)huawei.com> [ Upstream commit bfca5fb4e97c46503ddfc582335917b0cc228264 ] RPC client pipefs dentries cleanup is in separated rpc_remove_pipedir() workqueue,which takes care about pipefs superblock locking. In some special scenarios, when kernel frees the pipefs sb of the current client and immediately alloctes a new pipefs sb, rpc_remove_pipedir function would misjudge the existence of pipefs sb which is not the one it used to hold. As a result, the rpc_remove_pipedir would clean the released freed pipefs dentries. To fix this issue, rpc_remove_pipedir should check whether the current pipefs sb is consistent with the original pipefs sb. This error can be catched by KASAN: ========================================================= [ 250.497700] BUG: KASAN: slab-use-after-free in dget_parent+0x195/0x200 [ 250.498315] Read of size 4 at addr ffff88800a2ab804 by task kworker/0:18/106503 [ 250.500549] Workqueue: events rpc_free_client_work [ 250.501001] Call Trace: [ 250.502880] kasan_report+0xb6/0xf0 [ 250.503209] ? dget_parent+0x195/0x200 [ 250.503561] dget_parent+0x195/0x200 [ 250.503897] ? __pfx_rpc_clntdir_depopulate+0x10/0x10 [ 250.504384] rpc_rmdir_depopulate+0x1b/0x90 [ 250.504781] rpc_remove_client_dir+0xf5/0x150 [ 250.505195] rpc_free_client_work+0xe4/0x230 [ 250.505598] process_one_work+0x8ee/0x13b0 ... [ 22.039056] Allocated by task 244: [ 22.039390] kasan_save_stack+0x22/0x50 [ 22.039758] kasan_set_track+0x25/0x30 [ 22.040109] __kasan_slab_alloc+0x59/0x70 [ 22.040487] kmem_cache_alloc_lru+0xf0/0x240 [ 22.040889] __d_alloc+0x31/0x8e0 [ 22.041207] d_alloc+0x44/0x1f0 [ 22.041514] __rpc_lookup_create_exclusive+0x11c/0x140 [ 22.041987] rpc_mkdir_populate.constprop.0+0x5f/0x110 [ 22.042459] rpc_create_client_dir+0x34/0x150 [ 22.042874] rpc_setup_pipedir_sb+0x102/0x1c0 [ 22.043284] rpc_client_register+0x136/0x4e0 [ 22.043689] rpc_new_client+0x911/0x1020 [ 22.044057] rpc_create_xprt+0xcb/0x370 [ 22.044417] rpc_create+0x36b/0x6c0 ... [ 22.049524] Freed by task 0: [ 22.049803] kasan_save_stack+0x22/0x50 [ 22.050165] kasan_set_track+0x25/0x30 [ 22.050520] kasan_save_free_info+0x2b/0x50 [ 22.050921] __kasan_slab_free+0x10e/0x1a0 [ 22.051306] kmem_cache_free+0xa5/0x390 [ 22.051667] rcu_core+0x62c/0x1930 [ 22.051995] __do_softirq+0x165/0x52a [ 22.052347] [ 22.052503] Last potentially related work creation: [ 22.052952] kasan_save_stack+0x22/0x50 [ 22.053313] __kasan_record_aux_stack+0x8e/0xa0 [ 22.053739] __call_rcu_common.constprop.0+0x6b/0x8b0 [ 22.054209] dentry_free+0xb2/0x140 [ 22.054540] __dentry_kill+0x3be/0x540 [ 22.054900] shrink_dentry_list+0x199/0x510 [ 22.055293] shrink_dcache_parent+0x190/0x240 [ 22.055703] do_one_tree+0x11/0x40 [ 22.056028] shrink_dcache_for_umount+0x61/0x140 [ 22.056461] generic_shutdown_super+0x70/0x590 [ 22.056879] kill_anon_super+0x3a/0x60 [ 22.057234] rpc_kill_sb+0x121/0x200 Fixes: 0157d021d23a ("SUNRPC: handle RPC client pipefs dentries by network namespace aware routines") Signed-off-by: felix <fuzhen5(a)huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust(a)hammerspace.com> Signed-off-by: Hagar Hemdan <hagarhem(a)amazon.com> --- include/linux/sunrpc/clnt.h | 1 + net/sunrpc/clnt.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 166fc4e76df6..b324adc24bee 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -70,6 +70,7 @@ struct rpc_clnt { struct dentry *cl_debugfs; /* debugfs directory */ #endif struct rpc_xprt_iter cl_xpi; + struct super_block *pipefs_sb; }; /* diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index de917d45e512..82994e3c0fd0 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -112,7 +112,8 @@ static void rpc_clnt_remove_pipedir(struct rpc_clnt *clnt) pipefs_sb = rpc_get_sb_net(net); if (pipefs_sb) { - __rpc_clnt_remove_pipedir(clnt); + if (pipefs_sb == clnt->pipefs_sb) + __rpc_clnt_remove_pipedir(clnt); rpc_put_sb_net(net); } } @@ -152,6 +153,8 @@ rpc_setup_pipedir(struct super_block *pipefs_sb, struct rpc_clnt *clnt) { struct dentry *dentry; + clnt->pipefs_sb = pipefs_sb; + if (clnt->cl_program->pipe_dir_name != NULL) { dentry = rpc_setup_pipedir_sb(pipefs_sb, clnt); if (IS_ERR(dentry)) -- 2.40.1

12 months

2
1
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2024