The submit queue polling threads are userland threads that just never
exit to the userland. When creating the thread with IORING_SETUP_SQ_AFF,
the affinity of the poller thread is set to the cpu specified in
sq_thread_cpu. However, this CPU can be outside of the cpuset defined
by the cgroup cpuset controller. This violates the rules defined by the
cpuset controller and is a potential issue for realtime applications.
In b7ed6d8ffd6 we fixed the default affinity of the poller thread, in
case no explicit pinning is required by inheriting the one of the
creating task. In case of explicit pinning, the check is more
complicated, as also a cpu outside of the parent cpumask is allowed.
We implemented this by using cpuset_cpus_allowed (that has support for
cgroup cpusets) and testing if the requested cpu is in the set.
Fixes: 37d1e2e3642e ("io_uring: move SQPOLL thread io-wq forked worker")
Cc: stable(a)vger.kernel.org # 6.1+
Signed-off-by: Felix Moessbauer <felix.moessbauer(a)siemens.com>
---
Hi,
that's hopefully the last fix of cpu pinnings of the sq poller threads.
However, there is more to come on the io-wq side. E.g the syscalls for
IORING_REGISTER_IOWQ_AFF that can be used to change the affinites are
not yet protected. I'm currently just lacking good reproducers for that.
I also have to admit that I don't feel too comfortable making changes to
the wq part, given that I don't have good tests.
While fixing this, I'm wondering if it makes sense to add tests for the
combination of pinning and cpuset. If yes, where should these tests be
added?
Best regards,
Felix Moessbauer
Siemens AG
io_uring/sqpoll.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c
index 713be7c29388..b8ec8fec99b8 100644
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -10,6 +10,7 @@
#include <linux/slab.h>
#include <linux/audit.h>
#include <linux/security.h>
+#include <linux/cpuset.h>
#include <linux/io_uring.h>
#include <uapi/linux/io_uring.h>
@@ -459,10 +460,12 @@ __cold int io_sq_offload_create(struct io_ring_ctx *ctx,
return 0;
if (p->flags & IORING_SETUP_SQ_AFF) {
+ struct cpumask allowed_mask;
int cpu = p->sq_thread_cpu;
ret = -EINVAL;
- if (cpu >= nr_cpu_ids || !cpu_online(cpu))
+ cpuset_cpus_allowed(current, &allowed_mask);
+ if (!cpumask_test_cpu(cpu, &allowed_mask))
goto err_sqpoll;
sqd->sq_cpu = cpu;
} else {
--
2.39.2
On 13/09/2024 22:12, Sasha Levin wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0
> rate to 1.5GHz
>
> to the 6.10-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-
> queue.git;a=summary
>
> The filename of the patch is:
> riscv-dts-starfive-jh7110-common-fix-lower-rate-of-c.patch
> and it can be found in the queue-6.10 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree, please let
> <stable(a)vger.kernel.org> know about it.
>
Hi Sasha,
This patch only has the part of DTS without the clock driver patch[1].
[1]: https://lore.kernel.org/all/20240826080430.179788-2-xingyu.wu@starfivetech.…
I don't know your plan about this driver patch, or maybe I missed it.
But the DTS changes really needs the driver patch to work and you should add the driver patch.
Thanks,
Xingyu Wu
>
>
> commit 67b60bf9777bd340c7179adb5376dcdd3f0c260c
> Author: Xingyu Wu <xingyu.wu(a)starfivetech.com>
> Date: Mon Aug 26 16:04:30 2024 +0800
>
> riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0
> rate to 1.5GHz
>
> [ Upstream commit 61f2e8a3a94175dbbaad6a54f381b2a505324610 ]
>
> CPUfreq supports 4 cpu frequency loads on 375/500/750/1500MHz.
> But now PLL0 rate is 1GHz and the cpu frequency loads become
> 250/333/500/1000MHz in fact.
>
> The PLL0 rate should be default set to 1.5GHz and set the
> cpu_core rate to 500MHz in safe.
>
> Fixes: e2c510d6d630 ("riscv: dts: starfive: Add cpu scaling for JH7110 SoC")
> Signed-off-by: Xingyu Wu <xingyu.wu(a)starfivetech.com>
> Reviewed-by: Hal Feng <hal.feng(a)starfivetech.com>
> Signed-off-by: Conor Dooley <conor.dooley(a)microchip.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
> b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
> index 68d16717db8c..51d85f447626 100644
> --- a/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
> +++ b/arch/riscv/boot/dts/starfive/jh7110-common.dtsi
> @@ -354,6 +354,12 @@ spi_dev0: spi@0 {
> };
> };
>
> +&syscrg {
> + assigned-clocks = <&syscrg JH7110_SYSCLK_CPU_CORE>,
> + <&pllclk JH7110_PLLCLK_PLL0_OUT>;
> + assigned-clock-rates = <500000000>, <1500000000>; };
> +
> &sysgpio {
> i2c0_pins: i2c0-0 {
> i2c-pins {
Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.
Fixes: 52449c17bdd1 ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv(a)strace.io>
Reviewed-by: Björn Töpel <bjorn(a)rivosinc.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex(a)gmail.com>
---
arch/riscv/kernel/traps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 05a16b1f0aee..51ebfd23e007 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
regs->epc += 4;
regs->orig_a0 = regs->a0;
+ regs->a0 = -ENOSYS;
riscv_v_vstate_discard(regs);
@@ -328,8 +329,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
if (syscall >= 0 && syscall < NR_syscalls)
syscall_handler(regs, syscall);
- else if (syscall != -1)
- regs->a0 = -ENOSYS;
+
/*
* Ultimately, this value will get limited by KSTACK_OFFSET_MAX(),
* so the maximum stack offset is 1k bytes (10 bits).
--
2.45.2
This reverts commit ad6bcdad2b6724e113f191a12f859a9e8456b26d. I had
nak'd it, and Greg said on the thread that it links that he wasn't going
to take it either, especially since it's not his code or his tree, but
then, seemingly accidentally, it got pushed up some months later, in
what looks like a mistake, with no further discussion in the linked
thread. So revert it, since it's clearly not intended.
Fixes: ad6bcdad2b67 ("vmgenid: emit uevent when VMGENID updates")
Cc: stable(a)vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531095119.11202-2-bchalios@amazon.es
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/virt/vmgenid.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/virt/vmgenid.c b/drivers/virt/vmgenid.c
index b67a28da4702..a1c467a0e9f7 100644
--- a/drivers/virt/vmgenid.c
+++ b/drivers/virt/vmgenid.c
@@ -68,7 +68,6 @@ static int vmgenid_add(struct acpi_device *device)
static void vmgenid_notify(struct acpi_device *device, u32 event)
{
struct vmgenid_state *state = acpi_driver_data(device);
- char *envp[] = { "NEW_VMGENID=1", NULL };
u8 old_id[VMGENID_SIZE];
memcpy(old_id, state->this_id, sizeof(old_id));
@@ -76,7 +75,6 @@ static void vmgenid_notify(struct acpi_device *device, u32 event)
if (!memcmp(old_id, state->this_id, sizeof(old_id)))
return;
add_vmfork_randomness(state->this_id, sizeof(state->this_id));
- kobject_uevent_env(&device->dev.kobj, KOBJ_CHANGE, envp);
}
static const struct acpi_device_id vmgenid_ids[] = {
--
2.44.0
[ Upstream commit feabecaff5902f896531dde90646ca5dfa9d4f7d ]
If ipi_send_{mask|single}() is called with an invalid interrupt number, all
the local variables there will be NULL. ipi_send_verify() which is invoked
from these functions does verify its 'data' parameter, resulting in a
kernel oops in irq_data_get_affinity_mask() as the passed NULL pointer gets
dereferenced.
Add a missing NULL pointer check in ipi_send_verify()...
Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.
Fixes: 3b8e29a82dd1 ("genirq: Implement ipi_send_mask/single()")
Signed-off-by: Sergey Shtylyov <s.shtylyov(a)omp.ru>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lore.kernel.org/r/b541232d-c2b6-1fe9-79b4-a7129459e4d0@omp.ru
---
kernel/irq/ipi.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
Index: linux-stable/kernel/irq/ipi.c
===================================================================
--- linux-stable.orig/kernel/irq/ipi.c
+++ linux-stable/kernel/irq/ipi.c
@@ -186,9 +186,9 @@ EXPORT_SYMBOL_GPL(ipi_get_hwirq);
static int ipi_send_verify(struct irq_chip *chip, struct irq_data *data,
const struct cpumask *dest, unsigned int cpu)
{
- struct cpumask *ipimask = irq_data_get_affinity_mask(data);
+ struct cpumask *ipimask;
- if (!chip || !ipimask)
+ if (!chip || !data)
return -EINVAL;
if (!chip->ipi_send_single && !chip->ipi_send_mask)
@@ -197,6 +197,10 @@ static int ipi_send_verify(struct irq_ch
if (cpu >= nr_cpu_ids)
return -EINVAL;
+ ipimask = irq_data_get_affinity_mask(data);
+ if (!ipimask)
+ return -EINVAL;
+
if (dest) {
if (!cpumask_subset(dest, ipimask))
return -EINVAL;
Hi Greg, Sasha,
This batch contains a backport for fixes for 5.15-stable:
The following list shows the backported patches, I am using original commit
IDs for reference:
1) 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump")
2) efefd4f00c96 ("netfilter: nf_tables: missing iterator type in lookup walk")
Please, apply,
Thanks
Pablo Neira Ayuso (2):
netfilter: nft_set_pipapo: walk over current view on netlink dump
netfilter: nf_tables: missing iterator type in lookup walk
include/net/netfilter/nf_tables.h | 13 +++++++++++++
net/netfilter/nf_tables_api.c | 5 +++++
net/netfilter/nft_lookup.c | 1 +
net/netfilter/nft_set_pipapo.c | 6 ++++--
4 files changed, 23 insertions(+), 2 deletions(-)
--
2.30.2
Hi Greg, Sasha,
This batch contains a backport for fixes for 6.1-stable:
The following list shows the backported patches, I am using original commit
IDs for reference:
1) 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump")
2) efefd4f00c96 ("netfilter: nf_tables: missing iterator type in lookup walk")
Please, apply,
Thanks
Pablo Neira Ayuso (2):
netfilter: nft_set_pipapo: walk over current view on netlink dump
netfilter: nf_tables: missing iterator type in lookup walk
include/net/netfilter/nf_tables.h | 13 +++++++++++++
net/netfilter/nf_tables_api.c | 5 +++++
net/netfilter/nft_lookup.c | 1 +
net/netfilter/nft_set_pipapo.c | 6 ++++--
4 files changed, 23 insertions(+), 2 deletions(-)
--
2.30.2
Hi Greg, Sasha,
This batch contains a backport for fixes for 6.6-stable:
The following list shows the backported patches, I am using original commit
IDs for reference:
1) 29b359cf6d95 ("netfilter: nft_set_pipapo: walk over current view on netlink dump")
2) efefd4f00c96 ("netfilter: nf_tables: missing iterator type in lookup walk")
Please, apply,
Thanks
Pablo Neira Ayuso (2):
netfilter: nft_set_pipapo: walk over current view on netlink dump
netfilter: nf_tables: missing iterator type in lookup walk
include/net/netfilter/nf_tables.h | 13 +++++++++++++
net/netfilter/nf_tables_api.c | 5 +++++
net/netfilter/nft_lookup.c | 1 +
net/netfilter/nft_set_pipapo.c | 6 ++++--
4 files changed, 23 insertions(+), 2 deletions(-)
--
2.30.2
No upstream commit exists for this commit.
Pointer '&pdevs[i]' is dereferenced at x86_android_tablet_init()
after the referenced memory was deallocated by calling function
'x86_android_tablet_cleanup()'.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 5eba0141206e ("platform/x86: x86-android-tablets: Add support for instantiating platform-devs")
Signed-off-by: Aleksandr Burakov <a.burakov(a)rosalinux.ru>
---
drivers/platform/x86/x86-android-tablets.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/x86-android-tablets.c b/drivers/platform/x86/x86-android-tablets.c
index 9178076d9d7d..9838c5332201 100644
--- a/drivers/platform/x86/x86-android-tablets.c
+++ b/drivers/platform/x86/x86-android-tablets.c
@@ -1853,8 +1853,9 @@ static __init int x86_android_tablet_init(void)
for (i = 0; i < pdev_count; i++) {
pdevs[i] = platform_device_register_full(&dev_info->pdev_info[i]);
if (IS_ERR(pdevs[i])) {
+ int ret = PTR_ERR(pdevs[i]);
x86_android_tablet_cleanup();
- return PTR_ERR(pdevs[i]);
+ return ret;
}
}
--
2.25.1