The patch titled
Subject: mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-hugetlb-fix-potential-race-in-__update_and_free_hugetlb_folio.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Miaohe Lin <linmiaohe(a)huawei.com>
Subject: mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio()
Date: Mon, 8 Jul 2024 10:51:27 +0800
There is a potential race between __update_and_free_hugetlb_folio() and
try_memory_failure_hugetlb():
 CPU1                                   CPU2
 __update_and_free_hugetlb_folio        try_memory_failure_hugetlb
                                         folio_test_hugetlb
                                          -- It's still hugetlb folio.
  folio_clear_hugetlb_hwpoison
                                         spin_lock_irq(&hugetlb_lock);
                                          __get_huge_page_for_hwpoison
                                           folio_set_hugetlb_hwpoison
                                         spin_unlock_irq(&hugetlb_lock);
  spin_lock_irq(&hugetlb_lock);
  __folio_clear_hugetlb(folio);
   -- Hugetlb flag is cleared but too late.
  spin_unlock_irq(&hugetlb_lock);
When the above race occurs, the raw error page info will be leaked. Even
worse, the raw error pages won't have the hwpoisoned flag set and will
reach the pcplists/buddy allocator. Fix this issue by deferring
folio_clear_hugetlb_hwpoison() until __folio_clear_hugetlb() is done, so
that all raw error pages will have the hwpoisoned flag set.
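
The ordering problem described above can be sketched in a small,
single-threaded C model. This is only an illustration under stated
assumptions: struct toy_folio and every toy_* function are hypothetical
stand-ins, not kernel APIs, and the concurrent CPU2 step is modeled as a
call injected at a fixed point rather than as a real race.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a hugetlb folio; NOT the kernel's data layout. */
struct toy_folio {
	bool hugetlb;          /* models the hugetlb flag */
	bool head_hwpoison;    /* models HWPoison on the head page */
	int raw_hwp_entries;   /* models pending raw error page info */
	int poisoned_subpages; /* raw pages that received the HWPoison flag */
};

/* Models folio_clear_hugetlb_hwpoison(): move poison to the raw pages. */
static void toy_clear_hugetlb_hwpoison(struct toy_folio *f)
{
	assert(f->raw_hwp_entries >= 0);
	if (!f->head_hwpoison)
		return;
	f->poisoned_subpages += f->raw_hwp_entries;
	f->raw_hwp_entries = 0;
	f->head_hwpoison = false;
}

/* Models the memory-failure side (CPU2 in the diagram). */
static void toy_memory_failure(struct toy_folio *f)
{
	if (!f->hugetlb) {
		/* Assumption: a non-hugetlb page is poisoned directly. */
		f->poisoned_subpages++;
		return;
	}
	f->head_hwpoison = true;
	f->raw_hwp_entries++;
}

/* Old order: poison is moved first, then CPU2 interleaves. */
static int toy_free_old_order(struct toy_folio *f)
{
	toy_clear_hugetlb_hwpoison(f);
	toy_memory_failure(f);          /* CPU2 runs here */
	f->hugetlb = false;             /* flag cleared, but too late */
	return f->raw_hwp_entries;      /* leaked raw error page info */
}

/* New order: the hugetlb flag is cleared before poison is moved. */
static int toy_free_new_order(struct toy_folio *f)
{
	toy_memory_failure(f);          /* CPU2 runs before the flag clear */
	f->hugetlb = false;
	toy_clear_hugetlb_hwpoison(f);  /* picks up CPU2's late entry */
	return f->raw_hwp_entries;
}
```

In this model the old order leaves one raw entry behind with no poisoned
subpage, while the new order leaves none. A failure injected after the
hugetlb flag is cleared takes the non-hugetlb branch, so it is not lost
either.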
Link: https://lkml.kernel.org/r/20240708025127.107713-1-linmiaohe@huawei.com
Fixes: 32c877191e02 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap")
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Acked-by: Muchun Song <muchun.song(a)linux.dev>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-potential-race-in-__update_and_free_hugetlb_folio
+++ a/mm/hugetlb.c
@@ -1726,13 +1726,6 @@ static void __update_and_free_hugetlb_fo
 	}
 
 	/*
-	 * Move PageHWPoison flag from head page to the raw error pages,
-	 * which makes any healthy subpages reusable.
-	 */
-	if (unlikely(folio_test_hwpoison(folio)))
-		folio_clear_hugetlb_hwpoison(folio);
-
-	/*
 	 * If vmemmap pages were allocated above, then we need to clear the
 	 * hugetlb flag under the hugetlb lock.
 	 */
@@ -1742,6 +1735,13 @@ static void __update_and_free_hugetlb_fo
 		spin_unlock_irq(&hugetlb_lock);
 	}
 
+	/*
+	 * Move PageHWPoison flag from head page to the raw error pages,
+	 * which makes any healthy subpages reusable.
+	 */
+	if (unlikely(folio_test_hwpoison(folio)))
+		folio_clear_hugetlb_hwpoison(folio);
+
 	folio_ref_unfreeze(folio, 1);
 
 	/*
_
Patches currently in -mm which might be from linmiaohe(a)huawei.com are
mm-hugetlb-fix-potential-race-in-__update_and_free_hugetlb_folio.patch
mm-memory-failure-remove-obsolete-mf_msg_different_compound.patch
By default, an address assigned to the output interface is selected when
the source address is not specified. This is problematic when a route,
configured in a vrf, uses an interface from another vrf (aka route leak).
The original vrf does not own the selected source address.
Let's add a check against the output interface and call the appropriate
function to select the source address.
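
The decision described above can be sketched as a small standalone C
model. It is a simplification under stated assumptions: struct toy_dev
and toy_select_saddr() are hypothetical names, each device carries a
single address, and the prefsrc argument stands in for whatever
fib_result_prefsrc() would return for the output interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy network device; NOT struct net_device. */
struct toy_dev {
	const struct toy_dev *master; /* L3 master (vrf) device, or NULL */
	uint32_t addr;                /* one IPv4 address, host byte order */
};

/*
 * Pick a source address for a flow.
 * flow_l3mdev: vrf the flow belongs to (NULL if none).
 * out_dev:     output interface chosen by the route.
 * prefsrc:     address that would be chosen from the route/output device.
 */
static uint32_t toy_select_saddr(const struct toy_dev *flow_l3mdev,
				 const struct toy_dev *out_dev,
				 uint32_t prefsrc)
{
	assert(out_dev != NULL);

	/* No vrf, or the output device belongs to the flow's vrf. */
	if (!flow_l3mdev || out_dev->master == flow_l3mdev)
		return prefsrc;

	/* Route leak: the output device lives in another vrf, so take
	 * an address owned by the flow's own vrf instead. */
	return flow_l3mdev->addr;
}
```

With a flow in one vrf and an output interface enslaved to another, the
model returns the address of the flow's vrf rather than the address of
the leaked interface.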
CC: stable(a)vger.kernel.org
Fixes: 8cbb512c923d ("net: Add source address lookup op for VRF")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
---
net/ipv4/fib_semantics.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f669da98d11d..459082f4936d 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -2270,6 +2270,13 @@ void fib_select_path(struct net *net, struct fib_result *res,
fib_select_default(fl4, res);
check_saddr:
- if (!fl4->saddr)
- fl4->saddr = fib_result_prefsrc(net, res);
+ if (!fl4->saddr) {
+ struct net_device *l3mdev = dev_get_by_index_rcu(net, fl4->flowi4_l3mdev);
+
+ if (!l3mdev ||
+ l3mdev_master_dev_rcu(FIB_RES_DEV(*res)) == l3mdev)
+ fl4->saddr = fib_result_prefsrc(net, res);
+ else
+ fl4->saddr = inet_select_addr(l3mdev, 0, RT_SCOPE_LINK);
+ }
}
--
2.43.1
When the source address is selected, the scope must be checked. For
example, if a loopback address is assigned to the vrf device, it must not
be chosen for packets sent outside.
CC: stable(a)vger.kernel.org
Fixes: afbac6010aec ("net: ipv6: Address selection needs to consider L3 domains")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
---
net/ipv6/addrconf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 5c424a0e7232..4f2c5cc31015 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1873,7 +1873,8 @@ int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
master, &dst,
scores, hiscore_idx);
- if (scores[hiscore_idx].ifa)
+ if (scores[hiscore_idx].ifa &&
+ scores[hiscore_idx].scopedist >= 0)
goto out;
}
--
2.43.1
On Sun, Jul 07, 2024 at 10:58:55AM -0400, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> gpiolib: of: add polarity quirk for TSC2005
>
> to the 5.15-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> gpiolib-of-add-polarity-quirk-for-tsc2005.patch
> and it can be found in the queue-5.15 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Sasha, you are picking up a fix for an issue that has been there for 7
years (since 2017), that nobody reported, for a device that probably only
1 or 2 users in the world attempt to boot on a mainline kernel.
For this you are forklifting a bunch of other changes as dependencies.
I questioned myself when I was sending this change to mainline if it was
even worth it, and I am very sure the risks far outweigh the benefits(?)
of having this in stable.
The same goes for other cherry-picks to other stable branches.
Thanks.
--
Dmitry
From: Wang Yufen <wangyufen(a)huawei.com>
[ Upstream commit d8616ee2affcff37c5d315310da557a694a3303d ]
During TCP sockmap redirect pressure test, the following warning is triggered:
WARNING: CPU: 3 PID: 2145 at net/core/stream.c:205 sk_stream_kill_queues+0xbc/0xd0
CPU: 3 PID: 2145 Comm: iperf Kdump: loaded Tainted: G W 5.10.0+ #9
Call Trace:
inet_csk_destroy_sock+0x55/0x110
inet_csk_listen_stop+0xbb/0x380
tcp_close+0x41b/0x480
inet_release+0x42/0x80
__sock_release+0x3d/0xa0
sock_close+0x11/0x20
__fput+0x9d/0x240
task_work_run+0x62/0x90
exit_to_user_mode_prepare+0x110/0x120
syscall_exit_to_user_mode+0x27/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The cause we observed is as follows:
When the listener is closing, a connection may have completed the three-way
handshake but not yet been accepted, and the client may already have sent some
packets. The child sks in the accept queue are released by
inet_child_forget()->inet_csk_destroy_sock(), but the psocks of the child sks
are not released.
To fix this, add sock_map_destroy to release the psocks.
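
The idea of saving the original destroy callback and releasing the psock
from a wrapper can be sketched in plain C. This is a userspace model under
stated assumptions: every toy_* name is hypothetical, reference counting
is reduced to a plain integer, and there is no RCU or locking.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sock;
typedef void (*toy_destroy_fn)(struct toy_sock *sk);

/* Toy per-socket state; NOT struct sk_psock. */
struct toy_psock {
	toy_destroy_fn saved_destroy; /* original callback, saved at attach */
	int refcnt;
};

/* Toy socket; NOT struct sock. */
struct toy_sock {
	toy_destroy_fn destroy;
	struct toy_psock *psock;
	int destroyed;
};

/* Stands in for the protocol's original destroy handler. */
static void toy_base_destroy(struct toy_sock *sk)
{
	sk->destroyed++;
}

/* Wrapper: release the psock, then chain to the saved handler. */
static void toy_map_destroy(struct toy_sock *sk)
{
	struct toy_psock *psock;
	toy_destroy_fn saved;

	assert(sk != NULL);
	psock = sk->psock;
	if (!psock) {
		/* Nothing attached; the self-check only prevents this toy
		 * model from recursing into itself. */
		if (sk->destroy && sk->destroy != toy_map_destroy)
			sk->destroy(sk);
		return;
	}

	saved = psock->saved_destroy;
	sk->psock = NULL;  /* models unlinking the psock from the socket */
	psock->refcnt--;   /* models dropping the reference */
	if (saved)
		saved(sk);
}

/* Attach a psock and override destroy, keeping the original. */
static void toy_attach(struct toy_sock *sk, struct toy_psock *psock)
{
	psock->saved_destroy = sk->destroy;
	psock->refcnt = 1;
	sk->psock = psock;
	sk->destroy = toy_map_destroy;
}
```

After toy_attach(), destroying the socket through sk->destroy both drops
the psock reference and runs the original handler exactly once, which is
the behaviour the missing destroy hook could not provide.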
Signed-off-by: Wang Yufen <wangyufen(a)huawei.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii(a)kernel.org>
Acked-by: Jakub Sitnicki <jakub(a)cloudflare.com>
Acked-by: John Fastabend <john.fastabend(a)gmail.com>
Link: https://lore.kernel.org/bpf/20220524075311.649153-1-wangyufen@huawei.com
Stable-dep-of: 8bbabb3fddcd ("bpf, sock_map: Move cancel_work_sync() out of sock lock")
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Resolve conflict in include/linux/bpf.h due to function declaration position
and remove the non-existent sk_psock_stop helper from sock_map_destroy.]
Signed-off-by: Wen Gu <guwen(a)linux.alibaba.com>
---
background:
Link: https://lore.kernel.org/stable/d11bc7e6-a2c7-445a-8561-3599eafb07b0@linux.a…
@stable team:
This backport has 2 changes compared to the original patch:
- fix a conflict caused by the position of the sock_map_destroy declaration in
include/linux/bpf.h;
- remove the non-existent sk_psock_stop helper from sock_map_destroy. This helper
was introduced by 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
after v5.10; that commit is not a fix and is hard to backport. Considering that
what sk_psock_stop does is done in sk_psock_drop, and that neither sock_map_close
nor sock_map_unhash in v5.10 calls sk_psock_stop, I removed it from
sock_map_destroy too.
I tested it in my environment, and the regression was gone.
Cc: Wang Yufen <wangyufen(a)huawei.com>
@Yufen, if I missed anything, please point it out, thanks!
---
include/linux/bpf.h | 1 +
include/linux/skmsg.h | 1 +
net/core/skmsg.c | 1 +
net/core/sock_map.c | 22 ++++++++++++++++++++++
net/ipv4/tcp_bpf.c | 1 +
5 files changed, 26 insertions(+)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a75faf437e75..340f4fef5b5a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1800,6 +1800,7 @@ int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog);
int sock_map_prog_detach(const union bpf_attr *attr, enum bpf_prog_type ptype);
int sock_map_update_elem_sys(struct bpf_map *map, void *key, void *value, u64 flags);
void sock_map_unhash(struct sock *sk);
+void sock_map_destroy(struct sock *sk);
void sock_map_close(struct sock *sk, long timeout);
#else
static inline int sock_map_prog_update(struct bpf_map *map,
diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index 1138dd3071db..e2af013ec05f 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -98,6 +98,7 @@ struct sk_psock {
spinlock_t link_lock;
refcount_t refcnt;
void (*saved_unhash)(struct sock *sk);
+ void (*saved_destroy)(struct sock *sk);
void (*saved_close)(struct sock *sk, long timeout);
void (*saved_write_space)(struct sock *sk);
struct proto *sk_proto;
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index bb4fbc60b272..51792dda1b73 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -599,6 +599,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
psock->eval = __SK_NONE;
psock->sk_proto = prot;
psock->saved_unhash = prot->unhash;
+ psock->saved_destroy = prot->destroy;
psock->saved_close = prot->close;
psock->saved_write_space = sk->sk_write_space;
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 52e395a189df..d1d0ee2dbfaa 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1566,6 +1566,28 @@ void sock_map_unhash(struct sock *sk)
saved_unhash(sk);
}
+void sock_map_destroy(struct sock *sk)
+{
+ void (*saved_destroy)(struct sock *sk);
+ struct sk_psock *psock;
+
+ rcu_read_lock();
+ psock = sk_psock_get(sk);
+ if (unlikely(!psock)) {
+ rcu_read_unlock();
+ if (sk->sk_prot->destroy)
+ sk->sk_prot->destroy(sk);
+ return;
+ }
+
+ saved_destroy = psock->saved_destroy;
+ sock_map_remove_links(sk, psock);
+ rcu_read_unlock();
+ sk_psock_put(sk, psock);
+ saved_destroy(sk);
+}
+EXPORT_SYMBOL_GPL(sock_map_destroy);
+
void sock_map_close(struct sock *sk, long timeout)
{
void (*saved_close)(struct sock *sk, long timeout);
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index d0ca1fc325cd..f909e440bb22 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -582,6 +582,7 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
struct proto *base)
{
prot[TCP_BPF_BASE] = *base;
+ prot[TCP_BPF_BASE].destroy = sock_map_destroy;
prot[TCP_BPF_BASE].close = sock_map_close;
prot[TCP_BPF_BASE].recvmsg = tcp_bpf_recvmsg;
prot[TCP_BPF_BASE].stream_memory_read = tcp_bpf_stream_read;
--
2.32.0.3.g01195cf9f
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 3c1f81a1b554f49e99b34ca45324b35948c885db
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024070822-unfixed-paced-a31d@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
3c1f81a1b554 ("riscv: dts: starfive: Set EMMC vqmmc maximum voltage to 3.3V on JH7110 boards")
ac9a37e2d6b6 ("riscv: dts: starfive: introduce a common board dtsi for jh7110 based boards")
07da6ddf510b ("riscv: dts: starfive: visionfive 2: add "disable-wp" for tfcard")
0ffce9d49abd ("riscv: dts: starfive: visionfive 2: add tf cd-gpios")
ffddddf4aa8d ("riscv: dts: starfive: visionfive 2: use cpus label for timebase freq")
b9a1481f259c ("riscv: dts: starfive: visionfive 2: update sound and codec dt node name")
e0503d47e93d ("riscv: dts: starfive: visionfive 2: Remove non-existing I2S hardware")
dcde4e97b122 ("riscv: dts: starfive: visionfive 2: Remove non-existing TDM hardware")
0f74c64f0a9f ("riscv: dts: starfive: Remove PMIC interrupt info for Visionfive 2 board")
28ecaaa5af19 ("riscv: dts: starfive: jh7110: Add camera subsystem nodes")
8d01f741a046 ("riscv: dts: starfive: jh7110: Add PWM node and pins configuration")
79384a047535 ("Merge tag 'riscv-dt-for-v6.7' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3c1f81a1b554f49e99b34ca45324b35948c885db Mon Sep 17 00:00:00 2001
From: Shengyu Qu <wiagn233(a)outlook.com>
Date: Wed, 12 Jun 2024 18:33:31 +0800
Subject: [PATCH] riscv: dts: starfive: Set EMMC vqmmc maximum voltage to 3.3V
on JH7110 boards
Currently, for JH7110 boards with an EMMC slot, the vqmmc voltage for EMMC
is fixed at 1.8V, while the spec requires 3.3V in low speed mode and
support for switching to 1.8V when using a higher speed mode. Since no
other peripheral uses the same voltage source as EMMC's vqmmc (ALDO4) on
any board currently supported by the mainline kernel,
regulator-max-microvolt of ALDO4 should be set to 3.3V.
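
The effect of the constraint change can be illustrated with a tiny C
model. It is only a sketch: struct toy_constraints and
toy_voltage_allowed() are hypothetical names, and the check is a plain
range test rather than the regulator framework's actual logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy version of a regulator's min/max constraints, in microvolts. */
struct toy_constraints {
	int min_uV;
	int max_uV;
};

/* A requested voltage is usable only if it lies inside the range. */
static bool toy_voltage_allowed(const struct toy_constraints *c, int uV)
{
	assert(c->min_uV <= c->max_uV);
	return uV >= c->min_uV && uV <= c->max_uV;
}
```

With min = max = 1800000 a request for 3300000 falls outside the range;
raising max to 3300000 admits both 1.8V and 3.3V.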
Cc: stable(a)vger.kernel.org
Signed-off-by: Shengyu Qu <wiagn233(a)outlook.com>
Fixes: 7dafcfa79cc9 ("riscv: dts: starfive: enable DCDC1&ALDO4 node in axp15060")
Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
diff --git a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
index 8ff6ea64f048..68d16717db8c 100644
--- a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
+++ b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
@@ -244,7 +244,7 @@ emmc_vdd: aldo4 {
regulator-boot-on;
regulator-always-on;
regulator-min-microvolt = <1800000>;
- regulator-max-microvolt = <1800000>;
+ regulator-max-microvolt = <3300000>;
regulator-name = "emmc_vdd";
};
};