Hi,
This series is a v6.12-only backport (based on v6.12.62) of fixes for
system register initialization issues affecting protected KVM on arm64.
This affects some contemporary and upcoming hardware which will run the
v6.12.y stable kernel, or something derived from it, such as the Android
common kernel.
The FEAT_E2H0 patches fix code introduced after v6.6, and so only
need to be backported to v6.12.
The SCTLR_EL1 patch fixes code introduced in v5.11, but practically
speaking only affects recent hardware which is unlikely to run
something older than v6.12.
Note: Marc Zyngier performed the initial backport, which I have
rebased and tested, hence both of our sign-offs being added to the
tags from the upstream commits.
I have tested the backport and observed they solve the problems as
expected.
Ahmed Genidi (1):
KVM: arm64: Initialize SCTLR_EL1 in __kvm_hyp_init_cpu()
Marc Zyngier (1):
arm64: Revamp HCR_EL2.E2H RES1 detection
Mark Rutland (1):
KVM: arm64: Initialize HCR_EL2.E2H early
arch/arm64/include/asm/el2_setup.h | 57 +++++++++++++++++++++++++---
arch/arm64/kernel/head.S | 22 ++---------
arch/arm64/kvm/hyp/nvhe/hyp-init.S | 10 +++--
arch/arm64/kvm/hyp/nvhe/psci-relay.c | 3 ++
4 files changed, 65 insertions(+), 27 deletions(-)
--
2.43.0
From: Ilya Maximets <i.maximets(a)ovn.org>
[ Upstream commit 5ace7ef87f059d68b5f50837ef3e8a1a4870c36e ]
The push_nsh() action structure looks like this:
OVS_ACTION_ATTR_PUSH_NSH(OVS_KEY_ATTR_NSH(OVS_NSH_KEY_ATTR_BASE,...))
The outermost OVS_ACTION_ATTR_PUSH_NSH attribute is OK'ed by the
nla_for_each_nested() inside __ovs_nla_copy_actions(). The innermost
OVS_NSH_KEY_ATTR_BASE/MD1/MD2 are OK'ed by the nla_for_each_nested()
inside nsh_key_put_from_nlattr(). But nothing checks if the attribute
in the middle is OK. We don't even check that this attribute is the
OVS_KEY_ATTR_NSH. We just do a double unwrap with a pair of nla_data()
calls - first time directly while calling validate_push_nsh() and the
second time as part of the nla_for_each_nested() macro, which isn't
safe, potentially causing invalid memory access if the size of this
attribute is incorrect. The failure may not be noticed during
validation due to larger netlink buffer, but cause trouble later during
action execution where the buffer is allocated exactly to the size:
BUG: KASAN: slab-out-of-bounds in nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch]
Read of size 184 at addr ffff88816459a634 by task a.out/22624
CPU: 8 UID: 0 PID: 22624 6.18.0-rc7+ #115 PREEMPT(voluntary)
Call Trace:
<TASK>
dump_stack_lvl+0x51/0x70
print_address_description.constprop.0+0x2c/0x390
kasan_report+0xdd/0x110
kasan_check_range+0x35/0x1b0
__asan_memcpy+0x20/0x60
nsh_hdr_from_nlattr+0x1dd/0x6a0 [openvswitch]
push_nsh+0x82/0x120 [openvswitch]
do_execute_actions+0x1405/0x2840 [openvswitch]
ovs_execute_actions+0xd5/0x3b0 [openvswitch]
ovs_packet_cmd_execute+0x949/0xdb0 [openvswitch]
genl_family_rcv_msg_doit+0x1d6/0x2b0
genl_family_rcv_msg+0x336/0x580
genl_rcv_msg+0x9f/0x130
netlink_rcv_skb+0x11f/0x370
genl_rcv+0x24/0x40
netlink_unicast+0x73e/0xaa0
netlink_sendmsg+0x744/0xbf0
__sys_sendto+0x3d6/0x450
do_syscall_64+0x79/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
Let's add some checks that the attribute is properly sized and it's
the only one attribute inside the action. Technically, there is no
real reason for OVS_KEY_ATTR_NSH to be there, as we know that we're
pushing an NSH header already, it just creates extra nesting, but
that's how uAPI works today. So, keeping as it is.
Fixes: b2d0f5d5dc53 ("openvswitch: enable NSH support")
Reported-by: Junvy Yang <zhuque(a)tencent.com>
Signed-off-by: Ilya Maximets <i.maximets(a)ovn.org>
Acked-by: Eelco Chaudron echaudro(a)redhat.com
Reviewed-by: Aaron Conole <aconole(a)redhat.com>
Link: https://patch.msgid.link/20251204105334.900379-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
(cherry picked from commit 5ace7ef87f059d68b5f50837ef3e8a1a4870c36e)
Signed-off-by: Adrian Yip <adrian.ytw(a)gmail.com>
---
net/openvswitch/flow_netlink.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index d0b6e5872081..d4c8b4aa98b1 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -2786,13 +2786,20 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
return err;
}
-static bool validate_push_nsh(const struct nlattr *attr, bool log)
+static bool validate_push_nsh(const struct nlattr *a, bool log)
{
+ struct nlattr *nsh_key = nla_data(a);
struct sw_flow_match match;
struct sw_flow_key key;
+ /* There must be one and only one NSH header. */
+ if (!nla_ok(nsh_key, nla_len(a)) ||
+ nla_total_size(nla_len(nsh_key)) != nla_len(a) ||
+ nla_type(nsh_key) != OVS_KEY_ATTR_NSH)
+ return false;
+
ovs_match_init(&match, &key, true, NULL);
- return !nsh_key_put_from_nlattr(attr, &match, false, true, log);
+ return !nsh_key_put_from_nlattr(nsh_key, &match, false, true, log);
}
/* Return false if there are any non-masked bits set.
@@ -3346,7 +3353,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
return -EINVAL;
}
mac_proto = MAC_PROTO_NONE;
- if (!validate_push_nsh(nla_data(a), log))
+ if (!validate_push_nsh(a, log))
return -EINVAL;
break;
--
2.52.0
If SMT is disabled or a partial SMT state is enabled, when a new kernel
image is loaded for kexec, on reboot the following warning is observed:
kexec: Waking offline cpu 228.
WARNING: CPU: 0 PID: 9062 at arch/powerpc/kexec/core_64.c:223 kexec_prepare_cpus+0x1b0/0x1bc
[snip]
NIP kexec_prepare_cpus+0x1b0/0x1bc
LR kexec_prepare_cpus+0x1a0/0x1bc
Call Trace:
kexec_prepare_cpus+0x1a0/0x1bc (unreliable)
default_machine_kexec+0x160/0x19c
machine_kexec+0x80/0x88
kernel_kexec+0xd0/0x118
__do_sys_reboot+0x210/0x2c4
system_call_exception+0x124/0x320
system_call_vectored_common+0x15c/0x2ec
This occurs as add_cpu() fails due to cpu_bootable() returning false for
CPUs that fail the cpu_smt_thread_allowed() check or non primary
threads if SMT is disabled.
Fix the issue by enabling SMT and resetting the number of SMT threads to
the number of threads per core, before attempting to wake up all present
CPUs.
Fixes: 38253464bc82 ("cpu/SMT: Create topology_smt_thread_allowed()")
Reported-by: Sachin P Bappalige <sachinpb(a)linux.ibm.com>
Cc: stable(a)vger.kernel.org # v6.6+
Signed-off-by: Nysal Jan K.A. <nysal(a)linux.ibm.com>
---
arch/powerpc/kexec/core_64.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index 222aa326dace..ff6df43720c4 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -216,6 +216,11 @@ static void wake_offline_cpus(void)
{
int cpu = 0;
+ lock_device_hotplug();
+ cpu_smt_num_threads = threads_per_core;
+ cpu_smt_control = CPU_SMT_ENABLED;
+ unlock_device_hotplug();
+
for_each_present_cpu(cpu) {
if (!cpu_online(cpu)) {
printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
--
2.51.0
erofs readahead could fail with ENOMEM under the memory pressure because
it tries to alloc_page with GFP_NOWAIT | GFP_NORETRY, while GFP_KERNEL
for a regular read. And if readahead fails (with non-uptodate folios),
the original request will then fall back to synchronous read, and
`.read_folio()` should return appropriate errnos.
However, in scenarios where readahead and read operations compete,
read operation could return an unintended EIO because of an incorrect
error propagation.
To resolve this, this patch modifies the behavior so that, when the
PCL is for read(which means pcl.besteffort is true), it attempts actual
decompression instead of propagating the privios error except initial EIO.
- Page size: 4K
- The original size of FileA: 16K
- Compress-ratio per PCL: 50% (Uncompressed 8K -> Compressed 4K)
[page0, page1] [page2, page3]
[PCL0]---------[PCL1]
- functions declaration:
. pread(fd, buf, count, offset)
. readahead(fd, offset, count)
- Thread A tries to read the last 4K
- Thread B tries to do readahead 8K from 4K
- RA, besteffort == false
- R, besteffort == true
<process A> <process B>
pread(FileA, buf, 4K, 12K)
do readahead(page3) // failed with ENOMEM
wait_lock(page3)
if (!uptodate(page3))
goto do_read
readahead(FileA, 4K, 8K)
// Here create PCL-chain like below:
// [null, page1] [page2, null]
// [PCL0:RA]-----[PCL1:RA]
...
do read(page3) // found [PCL1:RA] and add page3 into it,
// and then, change PCL1 from RA to R
...
// Now, PCL-chain is as below:
// [null, page1] [page2, page3]
// [PCL0:RA]-----[PCL1:R]
// try to decompress PCL-chain...
z_erofs_decompress_queue
err = 0;
// failed with ENOMEM, so page 1
// only for RA will not be uptodated.
// it's okay.
err = decompress([PCL0:RA], err)
// However, ENOMEM propagated to next
// PCL, even though PCL is not only
// for RA but also for R. As a result,
// it just failed with ENOMEM without
// trying any decompression, so page2
// and page3 will not be uptodated.
** BUG HERE ** --> err = decompress([PCL1:R], err)
return err as ENOMEM
...
wait_lock(page3)
if (!uptodate(page3))
return EIO <-- Return an unexpected EIO!
...
Fixes: 2349d2fa02db ("erofs: sunset unneeded NOFAILs")
Cc: stable(a)vger.kernel.org
Reviewed-by: Jaewook Kim <jw5454.kim(a)samsung.com>
Reviewed-by: Sungjong Seo <sj1557.seo(a)samsung.com>
Signed-off-by: Junbeom Yeom <junbeom.yeom(a)samsung.com>
---
fs/erofs/zdata.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 27b1f44d10ce..86bf6e087d34 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1414,11 +1414,15 @@ static int z_erofs_decompress_queue(const struct z_erofs_decompressqueue *io,
};
struct z_erofs_pcluster *next;
int err = io->eio ? -EIO : 0;
+ int io_err = err;
for (; be.pcl != Z_EROFS_PCLUSTER_TAIL; be.pcl = next) {
+ int propagate_err;
+
DBG_BUGON(!be.pcl);
next = READ_ONCE(be.pcl->next);
- err = z_erofs_decompress_pcluster(&be, err) ?: err;
+ propagate_err = READ_ONCE(be.pcl->besteffort) ? io_err : err;
+ err = z_erofs_decompress_pcluster(&be, propagate_err) ?: err;
}
return err;
}
--
2.34.1
In the non-RT kernel, local_bh_disable() merely disables preemption,
whereas it maps to an actual spin lock in the RT kernel. Consequently,
when attempting to refill RX buffers via netdev_alloc_skb() in
macb_mac_link_up(), a deadlock scenario arises as follows:
Chain caused by macb_mac_link_up():
&bp->lock --> (softirq_ctrl.lock)
Chain caused by macb_start_xmit():
(softirq_ctrl.lock) --> _xmit_ETHER#2 --> &bp->lock
Notably, invoking the mog_init_rings() callback upon link establishment
is unnecessary. Instead, we can exclusively call mog_init_rings() within
the ndo_open() callback. This adjustment resolves the deadlock issue.
Given that mog_init_rings() is only applicable to
non-MACB_CAPS_MACB_IS_EMAC cases, we can simply move it to macb_open()
and simultaneously eliminate the MACB_CAPS_MACB_IS_EMAC check.
Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()")
Cc: stable(a)vger.kernel.org
Suggested-by: Kevin Hao <kexin.hao(a)windriver.com>
Signed-off-by: Xiaolei Wang <xiaolei.wang(a)windriver.com>
---
V1: https://patchwork.kernel.org/project/netdevbpf/patch/20251128103647.351259-…
V2: Update the correct lock dependency chain and add the Fix tag.
drivers/net/ethernet/cadence/macb_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index ca2386b83473..064fccdcf699 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -744,7 +744,6 @@ static void macb_mac_link_up(struct phylink_config *config,
/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
* cleared the pipeline and control registers.
*/
- bp->macbgem_ops.mog_init_rings(bp);
macb_init_buffers(bp);
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
@@ -2991,6 +2990,8 @@ static int macb_open(struct net_device *dev)
goto pm_exit;
}
+ bp->macbgem_ops.mog_init_rings(bp);
+
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
napi_enable(&queue->napi_rx);
napi_enable(&queue->napi_tx);
--
2.43.0
The local variable 'i' is initialized with -EINVAL, but the for loop
immediately overwrites it and -EINVAL is never returned.
If no empty compression mode can be found, the function would return the
out-of-bounds index IAA_COMP_MODES_MAX, which would cause an invalid
array access in add_iaa_compression_mode().
Fix both issues by returning either a valid index or -EINVAL.
Cc: stable(a)vger.kernel.org
Fixes: b190447e0fa3 ("crypto: iaa - Add compression mode management along with fixed mode")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
---
drivers/crypto/intel/iaa/iaa_crypto_main.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 23f585219fb4..8ee2a55ec449 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -221,15 +221,13 @@ static struct iaa_compression_mode *iaa_compression_modes[IAA_COMP_MODES_MAX];
static int find_empty_iaa_compression_mode(void)
{
- int i = -EINVAL;
+ int i;
- for (i = 0; i < IAA_COMP_MODES_MAX; i++) {
- if (iaa_compression_modes[i])
- continue;
- break;
- }
+ for (i = 0; i < IAA_COMP_MODES_MAX; i++)
+ if (!iaa_compression_modes[i])
+ return i;
- return i;
+ return -EINVAL;
}
static struct iaa_compression_mode *find_iaa_compression_mode(const char *name, int *idx)
--
Thorsten Blum <thorsten.blum(a)linux.dev>
GPG: 1D60 735E 8AEF 3BE4 73B6 9D84 7336 78FD 8DFE EAD4
OTX_CPT_UCODE_NAME_LENGTH limits the microcode name to 64 bytes. If a
user writes a string of exactly 64 characters, the original code used
'strlen(buf) > 64' to check the length, but then strscpy() copies only
63 characters before adding a NUL terminator, silently truncating the
copied string.
Fix this off-by-one error by using 'count' directly for the length check
to ensure long names are rejected early and copied without truncation.
Cc: stable(a)vger.kernel.org
Fixes: d9110b0b01ff ("crypto: marvell - add support for OCTEON TX CPT engine")
Signed-off-by: Thorsten Blum <thorsten.blum(a)linux.dev>
---
drivers/crypto/marvell/octeontx/otx_cptpf_ucode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/crypto/marvell/octeontx/otx_cptpf_ucode.c b/drivers/crypto/marvell/octeontx/otx_cptpf_ucode.c
index 9f5601c0280b..417a48f41350 100644
--- a/drivers/crypto/marvell/octeontx/otx_cptpf_ucode.c
+++ b/drivers/crypto/marvell/octeontx/otx_cptpf_ucode.c
@@ -1326,7 +1326,7 @@ static ssize_t ucode_load_store(struct device *dev,
int del_grp_idx = -1;
int ucode_idx = 0;
- if (strlen(buf) > OTX_CPT_UCODE_NAME_LENGTH)
+ if (count >= OTX_CPT_UCODE_NAME_LENGTH)
return -EINVAL;
eng_grps = container_of(attr, struct otx_cpt_eng_grps, ucode_load_attr);
--
Thorsten Blum <thorsten.blum(a)linux.dev>
GPG: 1D60 735E 8AEF 3BE4 73B6 9D84 7336 78FD 8DFE EAD4
A recent change fixing a device reference leak in a UDC driver
introduced a potential use-after-free in the non-OF case as the
isp1301_get_client() helper only increases the reference count for the
returned I2C device in the OF case.
Increment the reference count also for non-OF so that the caller can
decrement it unconditionally.
Note that this is inherently racy just as using the returned I2C device
is since nothing is preventing the PHY driver from being unbound while
in use.
Fixes: c84117912bdd ("USB: lpc32xx_udc: Fix error handling in probe")
Cc: stable(a)vger.kernel.org
Cc: Ma Ke <make24(a)iscas.ac.cn>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/phy/phy-isp1301.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/phy/phy-isp1301.c b/drivers/usb/phy/phy-isp1301.c
index f9b5c411aee4..2940f0c84e1b 100644
--- a/drivers/usb/phy/phy-isp1301.c
+++ b/drivers/usb/phy/phy-isp1301.c
@@ -149,7 +149,12 @@ struct i2c_client *isp1301_get_client(struct device_node *node)
return client;
/* non-DT: only one ISP1301 chip supported */
- return isp1301_i2c_client;
+ if (isp1301_i2c_client) {
+ get_device(&isp1301_i2c_client->dev);
+ return isp1301_i2c_client;
+ }
+
+ return NULL;
}
EXPORT_SYMBOL_GPL(isp1301_get_client);
--
2.51.2