The driver API devm_krealloc() calls alloc_dr() with the wrong argument,
@total_new_size, which causes more memory to be allocated than required.
Fix this memory waste by passing @new_size to alloc_dr() instead.
Fixes: f82485722e5d ("devres: provide devm_krealloc()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Previous discussion link:
https://lore.kernel.org/all/1718531655-29761-1-git-send-email-quic_zijuhu@q…
Changes since the original one:
- Correct title and commit message
- Add inline comments and stable tag
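To make the over-allocation concrete, here is a minimal userspace sketch,
not kernel code. struct devres_model, model_alloc_dr_total() and
model_wasted_bytes() are hypothetical names; the header layout is assumed
purely for illustration and only stands in for struct devres and for the
size computation that alloc_dr() performs through check_dr_size().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct devres: bookkeeping header + payload. */
struct devres_model {
	void *bookkeeping[4];	/* assumed header contents, illustration only */
	unsigned char data[];	/* payload handed back to the caller */
};

/*
 * Models the size computation alloc_dr() does via check_dr_size():
 * the header size is added to the requested size.
 * Returns 0 if the addition would overflow.
 */
static size_t model_alloc_dr_total(size_t size)
{
	size_t hdr = sizeof(struct devres_model);

	if (size > (size_t)-1 - hdr)
		return 0;
	return hdr + size;
}

/*
 * Extra bytes requested when the caller passes a size that already
 * includes the header (the old @total_new_size behaviour) instead of
 * the plain @new_size. Assumes neither computation overflows.
 */
static size_t model_wasted_bytes(size_t new_size)
{
	size_t total_new_size = sizeof(struct devres_model) + new_size;

	return model_alloc_dr_total(total_new_size) -
	       model_alloc_dr_total(new_size);
}
```

In this model the waste is exactly one header per reallocation, whatever
@new_size is, because the header is counted twice.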
drivers/base/devres.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c
index 3df0025d12aa..ff2247eec43c 100644
--- a/drivers/base/devres.c
+++ b/drivers/base/devres.c
@@ -896,9 +896,12 @@ void *devm_krealloc(struct device *dev, void *ptr, size_t new_size, gfp_t gfp)
/*
* Otherwise: allocate new, larger chunk. We need to allocate before
* taking the lock as most probably the caller uses GFP_KERNEL.
+ * alloc_dr() will call check_dr_size() to reserve extra memory
+ * for struct devres automatically, so size @new_size user request
+ * is delivered to it directly as devm_kmalloc() does.
*/
new_dr = alloc_dr(devm_kmalloc_release,
- total_new_size, gfp, dev_to_node(dev));
+ new_size, gfp, dev_to_node(dev));
if (!new_dr)
return NULL;
--
2.34.1
In the future, please send this to the regressions M/L and CC people
instead of just sending a private message.
For now, I've added the @regressions and @stable mailing lists as this
is an issue you found exposed specifically in the LTS series.
Hi Lars,
Can you please test 6.9.7? If this is still failing, can you please
check 6.10-rc6?
I'd like to understand if we just have a missing commit to backport or
it's a problem in the mainline kernel as well.
From the below description it's specifically with boost in passive
mode, right?
If 6.10-rc6 is still affected, can you please see if this commit helps?
https://git.kernel.org/pub/scm/linux/kernel/git/superm1/linux.git/commit/?h…
This is going into 6.11-rc1.
Perry, Jassmine,
Can you try to repro this using bleeding-edge or linux-next branches?
Thanks,
On 7/1/2024 4:33, Huang, Ray wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi all,
>
> Could you please help for a quick fix?
>
> -----Original Message-----
> From: Lars Wendler <wendler.lars(a)web.de>
> Sent: Monday, July 1, 2024 5:30 PM
> To: Huang, Ray <Ray.Huang(a)amd.com>
> Cc: gregkh(a)linuxfoundation.org
> Subject: linux-6.6.y: Regression in amd-pstate cpufreq driver since 6.6.34
>
> Hello dear kernel developers,
>
> I might have found a regression in the amd-pstate driver of linux-6.6 stable series. I haven't checked linux-master nor any other LTS branch.
>
>
> Now here's what I have found:
>
> Since linux-6.6.34 the following command fails:
>
> # echo 0 > /sys/devices/system/cpu/cpufreq/boost
> -bash: echo: write error: Invalid argument
>
> and indeed, disabling CPU boost seems to not work:
>
> # cat /sys/devices/system/cpu/cpufreq/boost
> 1
>
> I have bisected the issue to commit
> 8f893e52b9e030a25ea62e31271bf930b01f2f07:
>
> cpufreq: amd-pstate: Fix the inconsistency in max frequency units
>
> commit e4731baaf29438508197d3a8a6d4f5a8c51663f8 upstream.
>
> Reverting that commit (even on latest linux-6.6 release) gives me back the ability to disable CPU boost again.
>
> I can only reproduce this bug on my Zen4 machine:
>
> # lscpu | grep "^Model name:" | sed 's@[[:space:]][[:space:]]\+@ @'
> Model name: AMD Ryzen 7 7745HX with Radeon Graphics
>
> My older Zen3 machines seem not to be affected by this issue. All my Ryzen systems run on latest linux-6.6 kernels and have the following configuration regarding amd-pstate:
>
> # zgrep -F AMD_PSTATE /proc/config.gz
> CONFIG_X86_AMD_PSTATE=y
> CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=2
> # CONFIG_X86_AMD_PSTATE_UT is not set
>
>
> If you need more information, please don't hesitate to ask.
>
> Kind regards
> Lars Wendler
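For anyone who wants to check the failing write from a test program rather
than a shell, a minimal sketch follows. write_sysfs_value() is a
hypothetical helper, not an existing API; it only wraps open(2) and
write(2) and returns the errno of the failing call.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a short string to an existing sysfs-style file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_sysfs_value(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t written;
	int fd, err = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		err = -errno;		/* e.g. -EINVAL if the driver rejects the value */
	else if ((size_t)written != len)
		err = -EIO;		/* short write, treated as an error here */

	close(fd);
	return err;
}
```

Called as root with "/sys/devices/system/cpu/cpufreq/boost" and "0\n", this
would be expected to return -EINVAL on an affected kernel, matching the
"Invalid argument" error in the report above.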
Hi stable team,
Could you please backport [1] to linux-5.10.y?
I noticed a regression caused by [2], which has been in linux-5.10.y since v5.10.80.
After [2] removed the sock_map_unhash() helper, sock elems added to the bpf sock map
via sock_hash_update_common() cannot be removed while they are still in the
icsk_accept_queue of the listener sock. Since they have not been accept()ed, they
cannot be removed via sock_map_close()->sock_map_remove_links() either.
This can be reproduced in a network test with short-lived connections. If the server
is stopped during the test, some sock elems may remain in the bpf sock map.
[1] introduces the sock_map_destroy() helper, which invokes sock_map_remove_links()
on the inet_csk_listen_stop()->inet_child_forget()->inet_csk_destroy_sock() path and
so removes the sock elems from the bpf sock map in this situation.
[1] d8616ee2affc ("bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues")
(link: https://lore.kernel.org/all/20220524075311.649153-1-wangyufen@huawei.com/)
[2] 8b5c98a67c1b ("bpf, sockmap: Remove unhash handler for BPF sockmap usage")
(link: https://lore.kernel.org/all/20211103204736.248403-3-john.fastabend@gmail.co…)
Thanks!
Wen Gu
In the case of the COW file, new updates and GC writes are already
separated into the page caches of the atomic file and the COW file. As in
other cases that use the meta inode for GC, there are race issues between
a foreground thread and the GC thread.
To handle them, we need to take care over when to invalidate and wait for
writeback of GC pages in COW files, as in the case of using the meta inode.
Also, a pointer from the COW inode to the original inode is required to
check the state of the original pages.
For the former, we can solve the problem by using the meta inode for GC
of COW files. For the latter, let's get a page from the original inode in
move_data_block() when GCing the COW file to avoid a race condition.
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Cc: stable(a)vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo(a)samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil(a)samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong(a)samsung.com>
---
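The page selection described above can be illustrated with a small
userspace model. This is a sketch only: struct inode_model and the model_*
helpers are hypothetical and heavily simplified; they mirror the flag
checks and the COW-to-original back-pointer from this patch, not the real
f2fs structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for an f2fs inode. */
struct inode_model {
	bool atomic_file;		/* models f2fs_is_atomic_file() */
	bool cow_file;			/* models f2fs_is_cow_file() */
	struct inode_model *cow_inode;	/* atomic -> COW, and COW -> atomic */
};

/* Mirrors f2fs_used_in_atomic_write() as added by this patch. */
static bool model_used_in_atomic_write(const struct inode_model *inode)
{
	return inode->atomic_file || inode->cow_file;
}

/*
 * Inode whose page cache GC takes the page from: for a COW file this is
 * the original (atomic) inode reached through the back-pointer,
 * otherwise the inode itself.
 */
static struct inode_model *model_gc_page_owner(struct inode_model *inode)
{
	if (inode->cow_file)
		return inode->cow_inode;
	return inode;
}
```

The back-pointer is what lets the GC path look at the original inode's
pages while it is handling a block that belongs to the COW file.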
fs/f2fs/data.c | 2 +-
fs/f2fs/f2fs.h | 7 ++++++-
fs/f2fs/file.c | 3 +++
fs/f2fs/gc.c | 12 ++++++++++--
fs/f2fs/inline.c | 2 +-
fs/f2fs/inode.c | 3 ++-
6 files changed, 23 insertions(+), 6 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 05158f89ef32..90ff0f6f7f7f 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2651,7 +2651,7 @@ bool f2fs_should_update_outplace(struct inode *inode, struct f2fs_io_info *fio)
return true;
if (IS_NOQUOTA(inode))
return true;
- if (f2fs_is_atomic_file(inode))
+ if (f2fs_used_in_atomic_write(inode))
return true;
/* rewrite low ratio compress data w/ OPU mode to avoid fragmentation */
if (f2fs_compressed_file(inode) &&
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 59c5117e54b1..4f9fd1c1d024 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4267,9 +4267,14 @@ static inline bool f2fs_post_read_required(struct inode *inode)
f2fs_compressed_file(inode);
}
+static inline bool f2fs_used_in_atomic_write(struct inode *inode)
+{
+ return f2fs_is_atomic_file(inode) || f2fs_is_cow_file(inode);
+}
+
static inline bool f2fs_meta_inode_gc_required(struct inode *inode)
{
- return f2fs_post_read_required(inode) || f2fs_is_atomic_file(inode);
+ return f2fs_post_read_required(inode) || f2fs_used_in_atomic_write(inode);
}
/*
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 25b119cf3499..c9f0ba658cfd 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2116,6 +2116,9 @@ static int f2fs_ioc_start_atomic_write(struct file *filp, bool truncate)
set_inode_flag(fi->cow_inode, FI_COW_FILE);
clear_inode_flag(fi->cow_inode, FI_INLINE_DATA);
+
+ /* Set the COW inode's cow_inode to the atomic inode */
+ F2FS_I(fi->cow_inode)->cow_inode = inode;
} else {
/* Reuse the already created COW inode */
ret = f2fs_do_truncate_blocks(fi->cow_inode, 0, true);
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 136b9e8180a3..76854e732b35 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1188,7 +1188,11 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
};
int err;
- page = f2fs_grab_cache_page(mapping, index, true);
+ if (f2fs_is_cow_file(inode))
+ page = f2fs_grab_cache_page(F2FS_I(inode)->cow_inode->i_mapping,
+ index, true);
+ else
+ page = f2fs_grab_cache_page(mapping, index, true);
if (!page)
return -ENOMEM;
@@ -1287,7 +1291,11 @@ static int move_data_block(struct inode *inode, block_t bidx,
CURSEG_ALL_DATA_ATGC : CURSEG_COLD_DATA;
/* do not read out */
- page = f2fs_grab_cache_page(inode->i_mapping, bidx, false);
+ if (f2fs_is_cow_file(inode))
+ page = f2fs_grab_cache_page(F2FS_I(inode)->cow_inode->i_mapping,
+ bidx, false);
+ else
+ page = f2fs_grab_cache_page(inode->i_mapping, bidx, false);
if (!page)
return -ENOMEM;
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index ac00423f117b..0186ec049db6 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -16,7 +16,7 @@
static bool support_inline_data(struct inode *inode)
{
- if (f2fs_is_atomic_file(inode))
+ if (f2fs_used_in_atomic_write(inode))
return false;
if (!S_ISREG(inode->i_mode) && !S_ISLNK(inode->i_mode))
return false;
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index c26effdce9aa..c810304e2681 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -807,8 +807,9 @@ void f2fs_evict_inode(struct inode *inode)
f2fs_abort_atomic_write(inode, true);
- if (fi->cow_inode) {
+ if (fi->cow_inode && f2fs_is_cow_file(fi->cow_inode)) {
clear_inode_flag(fi->cow_inode, FI_COW_FILE);
+ F2FS_I(fi->cow_inode)->cow_inode = NULL;
iput(fi->cow_inode);
fi->cow_inode = NULL;
}
--
2.25.1
The page cache of the atomic file keeps new data pages which will be
stored in the COW file. It can also keep old data pages when GCing the
atomic file. In this case, new data can be overwritten by old data if a
GC thread marks the old data page dirty after the new data page was
evicted.
Also, since all writes to the atomic file are redirected to COW inodes,
GC for the atomic file does not work properly, as shown below.
f2fs_gc(gc_type=FG_GC)
  - select A as a victim segment
  do_garbage_collect
    - iget atomic file's inode for block B
    move_data_page
      f2fs_do_write_data_page
        - use dn of cow inode
        - set fio->old_blkaddr from cow inode
    - seg_freed is 0 since block B is still valid
  - goto gc_more and A is selected as victim again
To solve the problem, let's separate GC writes and updates in the atomic
file by using the meta inode for GC writes.
Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Cc: stable(a)vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo(a)samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil(a)samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong(a)samsung.com>
---
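The routing decision this patch changes can be modelled in a few lines of
userspace C. This is a sketch, not f2fs code: enum gc_path_model and the
model_* functions are hypothetical names, and the two booleans stand in
for f2fs_post_read_required() and f2fs_is_atomic_file().

```c
#include <stdbool.h>

/* Which GC write path a data block takes in this model. */
enum gc_path_model {
	GC_VIA_INODE_PAGE_CACHE,	/* models move_data_page() */
	GC_VIA_META_INODE,		/* models move_data_block() */
};

/* Mirrors f2fs_meta_inode_gc_required() as added by this patch. */
static bool model_meta_inode_gc_required(bool post_read_required,
					 bool atomic_file)
{
	return post_read_required || atomic_file;
}

/* Pick the GC path the way gc_data_segment() does after this patch. */
static enum gc_path_model model_pick_gc_path(bool post_read_required,
					     bool atomic_file)
{
	if (model_meta_inode_gc_required(post_read_required, atomic_file))
		return GC_VIA_META_INODE;
	return GC_VIA_INODE_PAGE_CACHE;
}
```

Before the patch only the first boolean mattered, so an atomic file
without post-read processing went through its own page cache, which is
the path shown in the trace above.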
fs/f2fs/f2fs.h | 5 +++++
fs/f2fs/gc.c | 6 +++---
fs/f2fs/segment.c | 4 ++--
3 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index a000cb024dbe..59c5117e54b1 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4267,6 +4267,11 @@ static inline bool f2fs_post_read_required(struct inode *inode)
f2fs_compressed_file(inode);
}
+static inline bool f2fs_meta_inode_gc_required(struct inode *inode)
+{
+ return f2fs_post_read_required(inode) || f2fs_is_atomic_file(inode);
+}
+
/*
* compress.c
*/
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a079eebfb080..136b9e8180a3 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1580,7 +1580,7 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
start_bidx = f2fs_start_bidx_of_node(nofs, inode) +
ofs_in_node;
- if (f2fs_post_read_required(inode)) {
+ if (f2fs_meta_inode_gc_required(inode)) {
int err = ra_data_block(inode, start_bidx);
f2fs_up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
@@ -1631,7 +1631,7 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
start_bidx = f2fs_start_bidx_of_node(nofs, inode)
+ ofs_in_node;
- if (f2fs_post_read_required(inode))
+ if (f2fs_meta_inode_gc_required(inode))
err = move_data_block(inode, start_bidx,
gc_type, segno, off);
else
@@ -1639,7 +1639,7 @@ static int gc_data_segment(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
segno, off);
if (!err && (gc_type == FG_GC ||
- f2fs_post_read_required(inode)))
+ f2fs_meta_inode_gc_required(inode)))
submitted++;
if (locked) {
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 7e47b8054413..b55fc4bd416a 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -3823,7 +3823,7 @@ void f2fs_wait_on_block_writeback(struct inode *inode, block_t blkaddr)
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct page *cpage;
- if (!f2fs_post_read_required(inode))
+ if (!f2fs_meta_inode_gc_required(inode))
return;
if (!__is_valid_data_blkaddr(blkaddr))
@@ -3842,7 +3842,7 @@ void f2fs_wait_on_block_writeback_range(struct inode *inode, block_t blkaddr,
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
block_t i;
- if (!f2fs_post_read_required(inode))
+ if (!f2fs_meta_inode_gc_required(inode))
return;
for (i = 0; i < len; i++)
--
2.25.1