The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x e255683c06df572ead96db5efb5d21be30c0efaa
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024082621-unluckily-aghast-028b@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
e255683c06df ("mptcp: pm: re-using ID of unused removed ADD_ADDR")
4b317e0eb287 ("mptcp: fix NL PM announced address accounting")
6fa0174a7c86 ("mptcp: more careful RM_ADDR generation")
7d9bf018f907 ("selftests: mptcp: update output info of chk_rm_nr")
327b9a94e2a8 ("selftests: mptcp: more stable join tests-cases")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From e255683c06df572ead96db5efb5d21be30c0efaa Mon Sep 17 00:00:00 2001
From: "Matthieu Baerts (NGI0)" <matttbe(a)kernel.org>
Date: Mon, 19 Aug 2024 21:45:19 +0200
Subject: [PATCH] mptcp: pm: re-using ID of unused removed ADD_ADDR
If no subflow is attached to the 'signal' endpoint that is being
removed, the addr ID will not be marked as available again.
Mark the linked ID as available when removing the address entry from the
list to cover this case.
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink")
Cc: stable(a)vger.kernel.org
Reviewed-by: Mat Martineau <martineau(a)kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-1-38035d40de5b…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 4cae2aa7be5c..26f0329e16bb 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -1431,7 +1431,10 @@ static bool mptcp_pm_remove_anno_addr(struct mptcp_sock *msk,
ret = remove_anno_list_by_saddr(msk, addr);
if (ret || force) {
spin_lock_bh(&msk->pm.lock);
- msk->pm.add_addr_signaled -= ret;
+ if (ret) {
+ __set_bit(addr->id, msk->pm.id_avail_bitmap);
+ msk->pm.add_addr_signaled--;
+ }
mptcp_pm_remove_addr(msk, &list);
spin_unlock_bh(&msk->pm.lock);
}
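For anyone preparing the backport, the effect of the change can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct pm_sketch, pm_init(), alloc_id() and the two remove_signal_addr_*() helpers are hypothetical names, a single 64-bit word stands in for msk->pm.id_avail_bitmap, and reserving ID 0 is a choice made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the PM state; not the kernel's struct. */
struct pm_sketch {
	uint64_t id_avail;               /* bit N set => ID N is free */
	unsigned int add_addr_signaled;  /* number of announced addresses */
};

void pm_init(struct pm_sketch *pm)
{
	/* All IDs free except ID 0, which this sketch treats as reserved. */
	pm->id_avail = ~UINT64_C(1);
	pm->add_addr_signaled = 0;
}

/* Announce an address: take the lowest free ID, or return -1 if none is left. */
int alloc_id(struct pm_sketch *pm)
{
	for (int id = 1; id < 64; id++) {
		if (pm->id_avail & (UINT64_C(1) << id)) {
			pm->id_avail &= ~(UINT64_C(1) << id);
			pm->add_addr_signaled++;
			return id;
		}
	}
	return -1;
}

/* Before the fix: only the counter is adjusted, the ID stays marked as used. */
void remove_signal_addr_old(struct pm_sketch *pm, int id, bool was_announced)
{
	(void)id;
	pm->add_addr_signaled -= was_announced;
}

/* After the fix: the ID is released together with the counter. */
void remove_signal_addr_new(struct pm_sketch *pm, int id, bool was_announced)
{
	assert(id > 0 && id < 64);
	if (was_announced) {
		pm->id_avail |= UINT64_C(1) << id;
		pm->add_addr_signaled--;
	}
}
```

With the old helper, an ID that was announced and then removed without any subflow attached is never handed out again; with the new one the next announcement can re-use it.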
v3:
- Amends the commit log for patch #1 per Johan's suggestion.
- Link to v2: https://lore.kernel.org/r/20240716-linux-next-24-07-13-camss-fixes-v2-0-e60…
v2:
- Updates commits with Johan's Review/Reported tags
- Adds Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com
- Cc's stable
- Adds in suggested kernel log to allow others to more easily match kernel
log to fixes
- Link to v1: https://lore.kernel.org/r/20240714-linux-next-24-07-13-camss-fixes-v1-0-8f8…
V1:
Dogfooding with SoftISP has uncovered two bugs in this series which I'm
posting fixes for.
The first error:
A simple race condition which, to be honest, I'm surprised neither I nor
anybody else has found earlier. Simply stated, the order in which we
typically end up loading CAMSS on boot has masked the pm_runtime_enable()
race condition that has been present in CAMSS for a long time.
If you blacklist qcom-camss in modules.d and then modprobe after boot,
the race condition shows up easily.
Moving the pm_runtime_enable prior to subdevice registration fixes the
problem.
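The ordering problem can be modelled in a few lines of userspace C. This is a sketch, not the CAMSS probe code: struct dev_sketch and all sketch_*() and probe_*() names are hypothetical. It relies on one kernel behaviour, namely that runtime PM "get" calls fail with -EACCES while runtime PM is still disabled for the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the device state relevant to the race. */
struct dev_sketch {
	bool rpm_enabled;  /* set by the pm_runtime_enable() stand-in */
	bool registered;   /* subdevices are visible to user space */
};

void sketch_pm_runtime_enable(struct dev_sketch *d)
{
	d->rpm_enabled = true;
}

/* Fails with -EACCES while runtime PM is disabled, as the kernel does. */
int sketch_pm_runtime_get(const struct dev_sketch *d)
{
	return d->rpm_enabled ? 0 : -EACCES;
}

void sketch_register_subdevs(struct dev_sketch *d)
{
	d->registered = true;
}

/* What an early user of the device sees once the nodes exist. */
int sketch_user_powers_up(const struct dev_sketch *d)
{
	assert(d->registered);
	return sketch_pm_runtime_get(d);
}

/* Registration first: a user can get in before runtime PM is enabled. */
int probe_register_first(void)
{
	struct dev_sketch d = { false, false };
	int ret;

	sketch_register_subdevs(&d);
	ret = sketch_user_powers_up(&d);  /* user space wins the race here */
	sketch_pm_runtime_enable(&d);
	return ret;
}

/* Runtime PM first: there is no window in which the get can fail. */
int probe_enable_first(void)
{
	struct dev_sketch d = { false, false };

	sketch_pm_runtime_enable(&d);
	sketch_register_subdevs(&d);
	return sketch_user_powers_up(&d);
}
```

The model makes the window deterministic; on real hardware it only opens when something touches the device between registration and pm_runtime_enable(), which is why the module load order can hide it.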
The second error:
Nomenclature:
- CSIPHY: CSI Physical layer analogue to digital domain serialiser
- CSID: CSI Decoder
- VFE: Video Front End
- RDI: Raw Data Interface
- VC: Virtual Channel
In order to support streaming multiple virtual-channels on the same RDI a
V4L2 provided use_count variable is used to decide whether or not to actually
terminate streaming and release buffers for 'msm_vfe_rdiX'.
Unfortunately use_count indicates the number of times msm_vfe_rdiX has
been opened by user-space not the number of concurrent streams on
msm_vfe_rdiX.
Simply stated use_count and stream_count are two different things.
The silicon enabling code to select between VCs is valid, but a different
solution needs to be found to support _concurrent_ VC streams.
Right now the upstream use_count as-is is breaking the non-concurrent VC
case and I don't believe there are upstream users of concurrent VCs on
CAMSS.
This series implements a revert for the invalid use_count check,
retaining the ability to select which VC is active on the RDI.
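To make the use_count versus stream_count distinction concrete, here is a small userspace sketch. All names (struct video_node_sketch, node_open(), node_start(), the two node_stop_*() variants and the scenario helper) are hypothetical and only model the bookkeeping described above; this is not the camss-video.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one video node such as msm_vfe_rdiX. */
struct video_node_sketch {
	int use_count;      /* times the node has been opened by user space */
	int stream_count;   /* streams currently running on the node */
	bool buffers_held;  /* true until stop_streaming releases the buffers */
};

void node_open(struct video_node_sketch *n)
{
	n->use_count++;
}

void node_start(struct video_node_sketch *n)
{
	n->stream_count++;
	n->buffers_held = true;
}

/* Guarded stop: skips the teardown while the node is open more than once. */
void node_stop_guarded(struct video_node_sketch *n)
{
	assert(n->stream_count > 0);
	n->stream_count--;
	if (n->use_count > 1)
		return;  /* wrong signal: this counts opens, not streams */
	n->buffers_held = false;
}

/* Stop without the guard: always tears down and releases the buffers. */
void node_stop_unguarded(struct video_node_sketch *n)
{
	assert(n->stream_count > 0);
	n->stream_count--;
	n->buffers_held = false;
}

/* Two opens, a single stream, then stop. Returns whether buffers leaked. */
bool two_opens_one_stream_leaks(bool guarded)
{
	struct video_node_sketch n = { 0, 0, false };

	node_open(&n);
	node_open(&n);
	node_start(&n);
	if (guarded)
		node_stop_guarded(&n);
	else
		node_stop_unguarded(&n);
	return n.buffers_held;
}
```

In this model the second open alone is enough to make the guarded stop skip its teardown, even though only one stream was ever running.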
Dogfooding with libcamera's SoftISP in Hangouts, Zoom and multiple runs
of libcamera's "qcam" application is a very different test-case to the
simple capture of frames we previously did when validating the
'use_count' change.
A partial revert in expectation of a renewed push to fixup that
concurrent VC issue is included.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
---
Bryan O'Donoghue (2):
media: qcom: camss: Remove use_count guard in stop_streaming
media: qcom: camss: Fix ordering of pm_runtime_enable
drivers/media/platform/qcom/camss/camss-video.c | 6 ------
drivers/media/platform/qcom/camss/camss.c | 5 +++--
2 files changed, 3 insertions(+), 8 deletions(-)
---
base-commit: c6ce8f9ab92edc9726996a0130bfc1c408132d47
change-id: 20240713-linux-next-24-07-13-camss-fixes-fa98c0965a5d
Best regards,
--
Bryan O'Donoghue <bryan.odonoghue(a)linaro.org>
#regzbot introduced: 3ee1a1fc3981
Dear maintainers,
I think I have found a cifs regression in the 6.10 kernel series, which leads
certain programs to write corrupt data.
After upgrading from kernel 6.9.12 to 6.10.6, flatpak and ostree are now
writing bad gpg signatures when exporting signed packages or signing their
repository metadata/summary files, whenever the repository is on a cifs mount.
Instead of writing the signature data, null bytes are written in its place.
Furthermore, ffmpeg and mkvmerge are now intermittently writing corrupt files
to cifs mounts.
No error is reported by the applications or the kernel when it happens.
In the case of flatpak, the problem isn't revealed until something tries to use
the repository and finds signatures full of null bytes. (Of course, this means
the affected repositories have been rendered useless.) In the case of ffmpeg
and mkvmerge, the problem isn't revealed until someone plays the video file and
reaches a corrupt section.
A kernel bisect reveals this:
3ee1a1fc39819906f04d6c62c180e760cd3a689d is the first bad commit
commit 3ee1a1fc39819906f04d6c62c180e760cd3a689d
Author: David Howells <dhowells(a)redhat.com>
Date: Fri Oct 6 18:29:59 2023 +0100
cifs: Cut over to using netfslib
I was unable to determine whether 6.11.0-rc4 fixes it, due to another cifs bug
in that version (which I hope to report soon).
An strace of flatpak (which uses libostree) shows it generating correct
signatures internally, but behaving differently on cifs vs. ext4 when working
with memory-mapped temp files, in which the signatures are stored before being
written to their final outputs. Here's where I reported my initial findings to
those projects:
https://github.com/flatpak/flatpak/issues/5911
https://github.com/ostreedev/ostree/issues/3288
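For anyone who wants a narrower check than running ostree, a minimal C sketch of the general pattern (store through a shared mapping, then read back through a file descriptor) might look like the following. This is an assumption-laden illustration: mmap_write_readback() is a hypothetical helper, it does not replicate what libostree actually does, and I have not verified that it triggers the regression. Reading back on the same client may also be served from the page cache, so a check from a second client or after a remount may be needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write 'len' bytes to 'path' through a MAP_SHARED mapping, then read them
 * back with pread(2). Returns the number of mismatching bytes, or -1 on
 * any error. Hypothetical helper, not taken from flatpak or libostree.
 */
long mmap_write_readback(const char *path, const unsigned char *data, size_t len)
{
	unsigned char buf[4096];
	unsigned char *map;
	size_t off = 0;
	long bad = 0;
	int fd, rc;

	assert(len > 0);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}
	memcpy(map, data, len);           /* store through the mapping */
	rc = msync(map, len, MS_SYNC);    /* ask for write-back */
	if (munmap(map, len) != 0 || rc != 0) {
		close(fd);
		return -1;
	}

	/* Read back through the file descriptor and compare byte by byte. */
	while (off < len) {
		size_t want = len - off < sizeof(buf) ? len - off : sizeof(buf);
		ssize_t got = pread(fd, buf, want, (off_t)off);

		if (got <= 0) {
			close(fd);
			return -1;
		}
		for (ssize_t i = 0; i < got; i++)
			if (buf[i] != data[off + (size_t)i])
				bad++;
		off += (size_t)got;
	}
	close(fd);
	return bad;
}
```

Calling it once with a path on the cifs mount and once with a path on ext4 should return 0 in both cases on a healthy kernel; a positive return value on cifs would point at the write-back path.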
Debian Testing and Unstable kernels (6.10.4-1 and 6.10.6-1) are affected:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1079394
The following reproducer script consistently triggers the problem for me. Run
it with two arguments: a path on a cifs mount where an ostree repo should be
created, and a GPG key ID with which to sign a commit.
#!/bin/sh
set -e
if [ "$#" -lt 2 ] || [ "$1" = "-h" ] ; then
    echo "usage: $(basename "$0") <repo-dir> <gpg-key-id>"
    exit 2
fi
repo=$1
keyid=$2
src="./foo"
echo "creating ostree repo at $repo"
ostree init --repo="$repo"
echo "creating source file tree at $src"
mkdir -p "$src"
echo hi > "$src"/hello
ostree commit --repo="$repo" --branch=foo --gpg-sign="$keyid" "$src"
if ostree show --repo="$repo" foo; then
    echo ---
    echo success!
else
    echo ---
    ostree show --repo="$repo" --print-detached-metadata-key=ostree.gpgsigs foo
    echo failure!
    echo look for null bytes in the above commit signature
fi
Fix a few issues in rescind handling in uio_hv_generic driver.
Patches are based on latest linux-next tip.
Steps to reproduce issue:
* Probe uio_hv_generic driver and create channels to use fcopy
* Disable the guest service on the host and then enable it.
or
* Repeatedly run cat "/dev/uioX" on the device created for fcopy.
Changes since v1:
https://lore.kernel.org/all/20240822110912.13735-1-namjain@linux.microsoft.…
* Added stable kernel list to cc
* Updated commit messages for more information
* Explicitly handle rescind callback for primary channel only, and add
comment: Saurabh, Michael.
* Rebase to latest tip.
Naman Jain (1):
Drivers: hv: vmbus: Fix rescind handling in uio_hv_generic
Saurabh Sengar (1):
uio_hv_generic: Fix kernel NULL pointer dereference in hv_uio_rescind
drivers/hv/vmbus_drv.c | 1 +
drivers/uio/uio_hv_generic.c | 11 ++++++++++-
2 files changed, 11 insertions(+), 1 deletion(-)
base-commit: 195a402a75791e6e0d96d9da27ca77671bc656a8
--
2.34.1
From: Anirudh Rayabharam (Microsoft) <anirudh(a)anirudhrb.com>
commit 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when
CPUs go online/offline") introduces a new cpuhp state for hyperv
initialization.
cpuhp_setup_state() returns the state number if state is
CPUHP_AP_ONLINE_DYN or CPUHP_BP_PREPARE_DYN and 0 for all other states.
For the hyperv case, since a new cpuhp state was introduced it would
return 0. However, in hv_machine_shutdown(), the cpuhp_remove_state() call
is conditioned upon "hyperv_init_cpuhp > 0". This will never be true and
so hv_cpu_die() won't be called on all CPUs. This means the VP assist page
won't be reset. When the kexec kernel tries to set up the VP assist page
again, the hypervisor corrupts the memory region of the old VP assist page
causing a panic in case the kexec kernel is using that memory elsewhere.
This was originally fixed in commit dfe94d4086e4 ("x86/hyperv: Fix kexec
panic/hang issues").
Get rid of hyperv_init_cpuhp entirely since we are no longer using a
dynamic cpuhp state and use CPUHP_AP_HYPERV_ONLINE directly with
cpuhp_remove_state().
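The return-value mismatch can be sketched in userspace C. This is an illustration only: the SKETCH_* enum values are made up, sketch_cpuhp_setup_state() models just the success-path convention described above (it ignores error returns and simply echoes the state number for the dynamic states), and the two *_shutdown_removes_state() helpers are hypothetical stand-ins for the condition in hv_machine_shutdown().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real enum is in include/linux/cpuhotplug.h. */
enum sketch_cpuhp_state {
	SKETCH_CPUHP_BP_PREPARE_DYN = 10,
	SKETCH_CPUHP_AP_HYPERV_ONLINE = 20,
	SKETCH_CPUHP_AP_ONLINE_DYN = 30,
};

/*
 * Success-path return convention: a positive state number for the two
 * dynamic states, 0 for every statically defined state.
 */
int sketch_cpuhp_setup_state(enum sketch_cpuhp_state state)
{
	assert(state > 0);
	if (state == SKETCH_CPUHP_AP_ONLINE_DYN ||
	    state == SKETCH_CPUHP_BP_PREPARE_DYN)
		return (int)state;
	return 0;
}

/* Old condition: depends on the saved return value being positive. */
bool old_shutdown_removes_state(int hyperv_init_cpuhp, bool kexec_in_progress)
{
	return kexec_in_progress && hyperv_init_cpuhp > 0;
}

/* New condition: the static state is removed whenever kexec is in progress. */
bool new_shutdown_removes_state(bool kexec_in_progress)
{
	return kexec_in_progress;
}
```

With a static state the saved value is 0, so the old condition is false even during kexec, while the new one no longer looks at a saved value at all.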
Cc: stable(a)vger.kernel.org
Fixes: 9636be85cc5b ("x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline")
Signed-off-by: Anirudh Rayabharam (Microsoft) <anirudh(a)anirudhrb.com>
---
v1->v2:
- Remove hyperv_init_cpuhp entirely and use CPUHP_AP_HYPERV_ONLINE directly
with cpuhp_remove_state().
v1: https://lore.kernel.org/linux-hyperv/87wmk2xt5i.fsf@redhat.com/T/#m54b8ae17…
---
arch/x86/hyperv/hv_init.c | 5 +----
arch/x86/include/asm/mshyperv.h | 1 -
arch/x86/kernel/cpu/mshyperv.c | 4 ++--
3 files changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 17a71e92a343..95eada2994e1 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -35,7 +35,6 @@
#include <clocksource/hyperv_timer.h>
#include <linux/highmem.h>
-int hyperv_init_cpuhp;
u64 hv_current_partition_id = ~0ull;
EXPORT_SYMBOL_GPL(hv_current_partition_id);
@@ -607,8 +606,6 @@ void __init hyperv_init(void)
register_syscore_ops(&hv_syscore_ops);
- hyperv_init_cpuhp = cpuhp;
-
if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
hv_get_partition_id();
@@ -637,7 +634,7 @@ void __init hyperv_init(void)
clean_guest_os_id:
wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
- cpuhp_remove_state(cpuhp);
+ cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
free_ghcb_page:
free_percpu(hv_ghcb_pg);
free_vp_assist_page:
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 390c4d13956d..5f0bc6a6d025 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -40,7 +40,6 @@ static inline unsigned char hv_get_nmi_reason(void)
}
#if IS_ENABLED(CONFIG_HYPERV)
-extern int hyperv_init_cpuhp;
extern bool hyperv_paravisor_present;
extern void *hv_hypercall_pg;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index e0fd57a8ba84..e98db51f25ba 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -199,8 +199,8 @@ static void hv_machine_shutdown(void)
* Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
* corrupts the old VP Assist Pages and can crash the kexec kernel.
*/
- if (kexec_in_progress && hyperv_init_cpuhp > 0)
- cpuhp_remove_state(hyperv_init_cpuhp);
+ if (kexec_in_progress)
+ cpuhp_remove_state(CPUHP_AP_HYPERV_ONLINE);
/* The function calls stop_other_cpus(). */
native_machine_shutdown();
--
2.45.2
From: yangge <yangge1116(a)126.com>
If a large amount of CMA memory is configured in the system (for example,
CMA memory accounts for 50% of the system memory), starting a virtual
machine will call pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin
memory. Normally, if a page is present and in a CMA area,
pin_user_pages_remote() will migrate the page from the CMA area to a
non-CMA area because of the FOLL_LONGTERM flag. But the current code causes
the migration to fail due to unexpected page refcounts, and eventually
causes the virtual machine to fail to start.
Adding a page to an LRU batch increases its refcount by one, and removing
the page from the LRU batch decreases it by one. Page migration requires
that the page is not referenced by anything other than its page mappings.
Before migrating a page, we should try to drain it from the LRU batch in
case it is in one. However, folio_test_lru() is not sufficient to tell
whether the page is in an LRU batch or not, and if the page is in an LRU
batch, the migration will fail.
To solve the problem above, we modify the logic of adding to an LRU batch.
Before adding a page to an LRU batch, we clear the LRU flag of the page so
that we can check whether the page is in an LRU batch by folio_test_lru(page).
Keeping the LRU flag of the page invisible for a long time seems to be no
problem, because when a new page is allocated from buddy and added to the
LRU batch, its LRU flag is also not visible for a long time.
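The idea can be modelled with a toy folio in userspace C. This is a sketch under assumptions: struct folio_sketch and every function below are hypothetical, in_batch is ground truth the migration side cannot see, and migration_would_drain() assumes a caller that drains the per-CPU batches only when the LRU flag is clear.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy folio: only the fields needed for the argument. */
struct folio_sketch {
	int refcount;
	bool lru_flag;  /* stand-in for the LRU page flag */
	bool in_batch;  /* ground truth, invisible to the migration code */
};

/* Old scheme: take a reference and queue; the LRU flag stays set. */
void batch_add_old(struct folio_sketch *f)
{
	f->refcount++;
	f->in_batch = true;
}

/* New scheme: clear the LRU flag before queueing; back out if it was clear. */
bool batch_add_new(struct folio_sketch *f)
{
	f->refcount++;
	if (!f->lru_flag) {
		f->refcount--;
		return false;
	}
	f->lru_flag = false;
	f->in_batch = true;
	return true;
}

/* Draining drops the batch reference and makes the folio visible again. */
void batch_drain(struct folio_sketch *f)
{
	if (!f->in_batch)
		return;
	f->in_batch = false;
	f->lru_flag = true;
	f->refcount--;
}

/* Assumed migration-side check: drain only if the LRU flag is clear. */
bool migration_would_drain(const struct folio_sketch *f)
{
	return !f->lru_flag;
}
```

In the old scheme a batched folio still looks like an ordinary LRU folio, so nothing prompts a drain and the extra reference stays in place. In the new scheme the cleared flag is what signals that a drain is needed.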
Cc: <stable(a)vger.kernel.org>
Signed-off-by: yangge <yangge1116(a)126.com>
---
mm/swap.c | 43 +++++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 12 deletions(-)
diff --git a/mm/swap.c b/mm/swap.c
index dc205bd..9caf6b0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -211,10 +211,6 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
for (i = 0; i < folio_batch_count(fbatch); i++) {
struct folio *folio = fbatch->folios[i];
- /* block memcg migration while the folio moves between lru */
- if (move_fn != lru_add_fn && !folio_test_clear_lru(folio))
- continue;
-
folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
move_fn(lruvec, folio);
@@ -255,11 +251,16 @@ static void lru_move_tail_fn(struct lruvec *lruvec, struct folio *folio)
void folio_rotate_reclaimable(struct folio *folio)
{
if (!folio_test_locked(folio) && !folio_test_dirty(folio) &&
- !folio_test_unevictable(folio) && folio_test_lru(folio)) {
+ !folio_test_unevictable(folio)) {
struct folio_batch *fbatch;
unsigned long flags;
folio_get(folio);
+ if (!folio_test_clear_lru(folio)) {
+ folio_put(folio);
+ return;
+ }
+
local_lock_irqsave(&lru_rotate.lock, flags);
fbatch = this_cpu_ptr(&lru_rotate.fbatch);
folio_batch_add_and_move(fbatch, folio, lru_move_tail_fn);
@@ -352,11 +353,15 @@ static void folio_activate_drain(int cpu)
void folio_activate(struct folio *folio)
{
- if (folio_test_lru(folio) && !folio_test_active(folio) &&
- !folio_test_unevictable(folio)) {
+ if (!folio_test_active(folio) && !folio_test_unevictable(folio)) {
struct folio_batch *fbatch;
folio_get(folio);
+ if (!folio_test_clear_lru(folio)) {
+ folio_put(folio);
+ return;
+ }
+
local_lock(&cpu_fbatches.lock);
fbatch = this_cpu_ptr(&cpu_fbatches.activate);
folio_batch_add_and_move(fbatch, folio, folio_activate_fn);
@@ -700,6 +705,11 @@ void deactivate_file_folio(struct folio *folio)
return;
folio_get(folio);
+ if (!folio_test_clear_lru(folio)) {
+ folio_put(folio);
+ return;
+ }
+
local_lock(&cpu_fbatches.lock);
fbatch = this_cpu_ptr(&cpu_fbatches.lru_deactivate_file);
folio_batch_add_and_move(fbatch, folio, lru_deactivate_file_fn);
@@ -716,11 +726,16 @@ void deactivate_file_folio(struct folio *folio)
*/
void folio_deactivate(struct folio *folio)
{
- if (folio_test_lru(folio) && !folio_test_unevictable(folio) &&
- (folio_test_active(folio) || lru_gen_enabled())) {
+ if (!folio_test_unevictable(folio) && (folio_test_active(folio) ||
+ lru_gen_enabled())) {
struct folio_batch *fbatch;
folio_get(folio);
+ if (!folio_test_clear_lru(folio)) {
+ folio_put(folio);
+ return;
+ }
+
local_lock(&cpu_fbatches.lock);
fbatch = this_cpu_ptr(&cpu_fbatches.lru_deactivate);
folio_batch_add_and_move(fbatch, folio, lru_deactivate_fn);
@@ -737,12 +752,16 @@ void folio_deactivate(struct folio *folio)
*/
void folio_mark_lazyfree(struct folio *folio)
{
- if (folio_test_lru(folio) && folio_test_anon(folio) &&
- folio_test_swapbacked(folio) && !folio_test_swapcache(folio) &&
- !folio_test_unevictable(folio)) {
+ if (folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+ !folio_test_swapcache(folio) && !folio_test_unevictable(folio)) {
struct folio_batch *fbatch;
folio_get(folio);
+ if (!folio_test_clear_lru(folio)) {
+ folio_put(folio);
+ return;
+ }
+
local_lock(&cpu_fbatches.lock);
fbatch = this_cpu_ptr(&cpu_fbatches.lru_lazyfree);
folio_batch_add_and_move(fbatch, folio, lru_lazyfree_fn);
--
2.7.4
Hi all,
As some of you have noticed, there's a TON of failure messages being
sent out for AMD gpu driver commits that are tagged for stable
backports. In short, you all are doing something really wrong with how
you are tagging these.
Please fix it up to NOT have duplicates in multiple branches that end up
in Linus's tree at different times. Or if you MUST do that, then give
us a chance to figure out that it IS a duplicate. As-is, it's not
working at all, and I think I need to just drop all patches for this
driver that are tagged for stable going forward and rely on you all to
provide a proper set of backported fixes when you say they are needed.
Again, what you are doing today is NOT ok and is broken. Please fix.
greg k-h
Hi, all.
Recently syzbot reported the following bug:
kernel BUG at fs/f2fs/inode.c:896!
CPU: 1 UID: 0 PID: 5217 Comm: syz-executor605 Not tainted 6.11.0-rc4-syzkaller-00033-g872cf28b8df9 #0
RIP: 0010:f2fs_evict_inode+0x1598/0x15c0 fs/f2fs/inode.c:896
Call Trace:
<TASK>
evict+0x532/0x950 fs/inode.c:704
dispose_list fs/inode.c:747 [inline]
evict_inodes+0x5f9/0x690 fs/inode.c:797
generic_shutdown_super+0x9d/0x2d0 fs/super.c:627
kill_block_super+0x44/0x90 fs/super.c:1696
kill_f2fs_super+0x344/0x690 fs/f2fs/super.c:4898
deactivate_locked_super+0xc4/0x130 fs/super.c:473
cleanup_mnt+0x41f/0x4b0 fs/namespace.c:1373
task_work_run+0x24f/0x310 kernel/task_work.c:228
ptrace_notify+0x2d2/0x380 kernel/signal.c:2402
ptrace_report_syscall include/linux/ptrace.h:415 [inline]
ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
syscall_exit_work+0xc6/0x190 kernel/entry/common.c:173
syscall_exit_to_user_mode_prepare kernel/entry/common.c:200 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:205 [inline]
syscall_exit_to_user_mode+0x279/0x370 kernel/entry/common.c:218
do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The syzbot constructed the following scenario: concurrently
creating directories and setting the file system to read-only.
In this case, while f2fs was creating a directory, the filesystem switched
to read-only, and when it tried to clear the dirty flag, it triggered this
code path: f2fs_mkdir()->f2fs_sync_fs()->f2fs_write_checkpoint()
->f2fs_readonly(). This resulted in the FI_DIRTY_INODE flag not being
cleared, which eventually triggered the bug during the FI_DIRTY_INODE
check in f2fs_evict_inode().
In this case, we cannot do anything further, so if the filesystem is
read-only, do not trigger the BUG. Instead, clean up resources to the best
of our ability to prevent triggering subsequent resource leak checks.
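The intended decision can be written out as a small truth function in userspace C. This is a sketch of the condition only: struct sb_state_sketch, enum evict_action and both evict_dirty_check_*() helpers are hypothetical names that mirror the three checks involved, not f2fs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What the eviction path does about a possibly dirty inode. */
enum evict_action {
	EVICT_BUG_ON_DIRTY,  /* assert that FI_DIRTY_INODE is clear */
	EVICT_MARK_SYNCED,   /* quietly clean up instead */
};

/* Hypothetical snapshot of the superblock state consulted by the check. */
struct sb_state_sketch {
	bool cp_error;     /* checkpoint error seen */
	bool cp_disabled;  /* checkpointing disabled */
	bool readonly;     /* filesystem is read-only */
};

/* Condition before the patch: the read-only state is not considered. */
enum evict_action evict_dirty_check_old(const struct sb_state_sketch *s)
{
	assert(s != NULL);
	if (!s->cp_error && !s->cp_disabled)
		return EVICT_BUG_ON_DIRTY;
	return EVICT_MARK_SYNCED;
}

/* Condition after the patch: a read-only filesystem also skips the BUG. */
enum evict_action evict_dirty_check_new(const struct sb_state_sketch *s)
{
	assert(s != NULL);
	if (!s->cp_error && !s->cp_disabled && !s->readonly)
		return EVICT_BUG_ON_DIRTY;
	return EVICT_MARK_SYNCED;
}
```

The only input for which the two functions differ is a healthy, checkpoint-enabled filesystem that has become read-only, which is the case the syzbot scenario hits.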
If there is anything important I'm missing, please let me know, thanks.
Reported-by: syzbot+ebea2790904673d7c618(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ebea2790904673d7c618
Fixes: ca7d802a7d8e ("f2fs: detect dirty inode in evict_inode")
CC: stable(a)vger.kernel.org
Signed-off-by: Julian Sun <sunjunchao2870(a)gmail.com>
---
fs/f2fs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index aef57172014f..ebf825dba0a5 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -892,7 +892,8 @@ void f2fs_evict_inode(struct inode *inode)
atomic_read(&fi->i_compr_blocks));
if (likely(!f2fs_cp_error(sbi) &&
- !is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
+ !is_sbi_flag_set(sbi, SBI_CP_DISABLED)) &&
+ !f2fs_readonly(sbi->sb))
f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
else
f2fs_inode_synced(inode);
--
2.39.2