This fix is the deadline version of the change made to the rt scheduler
titled: "sched/rt: Fix race in push_rt_task".
Here is the summary of the issue:
When a CPU chooses to call push_dl_task and picks a task to push to
another CPU's runqueue then it will call find_lock_later_rq method
which would take a double lock on both CPUs' runqueues. If one of the
locks aren't readily available, it may lead to dropping the current
runqueue lock and reacquiring both the locks at once. During this window
it is possible that the task is already migrated and is running on some
other CPU. These cases are already handled. However, if the task is
migrated and has already been executed and another CPU is now trying to
wake it up (ttwu) such that it is queued again on the runqeue
(on_rq is 1) and also if the task was run by the same CPU, then the
current checks will pass even though the task was migrated out and is no
longer in the pushable tasks list.
Please go through the original change for more details on the issue.
In this fix, after the lock is obtained inside the find_lock_later_rq we
ensure that the task is still at the head of pushable tasks list. Also
removed some checks that are no longer needed with the addition this new
check.
However, the check of pushable tasks list only applies when
find_lock_later_rq is called by push_dl_task. For the other caller i.e.
dl_task_offline_migration, we use the existing checks.
Signed-off-by: Harshit Agarwal <harshit(a)nutanix.com>
Cc: stable(a)vger.kernel.org
---
Changes in v2:
- As per Juri's suggestion, moved the check inside find_lock_later_rq
similar to rt change. Here we distinguish among the push_dl_task
caller vs dl_task_offline_migration by checking if the task is
throttled or not.
- Fixed the commit message to refer to the rt change by title.
- Link to v1:
https://lore.kernel.org/lkml/20250307204255.60640-1-harshit@nutanix.com/
---
kernel/sched/deadline.c | 66 ++++++++++++++++++++++++++---------------
1 file changed, 42 insertions(+), 24 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 38e4537790af..2366801b4557 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2621,6 +2621,25 @@ static int find_later_rq(struct task_struct *task)
return -1;
}
+static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
+{
+ struct task_struct *p;
+
+ if (!has_pushable_dl_tasks(rq))
+ return NULL;
+
+ p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
+
+ WARN_ON_ONCE(rq->cpu != task_cpu(p));
+ WARN_ON_ONCE(task_current(rq, p));
+ WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
+
+ WARN_ON_ONCE(!task_on_rq_queued(p));
+ WARN_ON_ONCE(!dl_task(p));
+
+ return p;
+}
+
/* Locks the rq it finds */
static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
{
@@ -2648,12 +2667,30 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
/* Retry if something changed. */
if (double_lock_balance(rq, later_rq)) {
- if (unlikely(task_rq(task) != rq ||
+ /*
+ * We had to unlock the run queue. In the meantime,
+ * task could have migrated already or had its affinity
+ * changed.
+ * It is possible the task was scheduled, set
+ * "migrate_disabled" and then got preempted, so we must
+ * check the task migration disable flag here too.
+ * For throttled task (dl_task_offline_migration), we
+ * check if the task is migrated to a different rq or
+ * is not a dl task anymore.
+ * For the non-throttled task (push_dl_task), the check
+ * to ensure that this task is still at the head of the
+ * pushable tasks list is enough.
+ */
+ if (unlikely(is_migration_disabled(task) ||
!cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) ||
- task_on_cpu(rq, task) ||
- !dl_task(task) ||
- is_migration_disabled(task) ||
- !task_on_rq_queued(task))) {
+ (task->dl.dl_throttled &&
+ (task_rq(task) != rq ||
+ task_on_cpu(rq, task) ||
+ !dl_task(task) ||
+ !task_on_rq_queued(task))) ||
+ (!task->dl.dl_throttled &&
+ task != pick_next_pushable_dl_task(rq)))) {
+
double_unlock_balance(rq, later_rq);
later_rq = NULL;
break;
@@ -2676,25 +2713,6 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq)
return later_rq;
}
-static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
-{
- struct task_struct *p;
-
- if (!has_pushable_dl_tasks(rq))
- return NULL;
-
- p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
-
- WARN_ON_ONCE(rq->cpu != task_cpu(p));
- WARN_ON_ONCE(task_current(rq, p));
- WARN_ON_ONCE(p->nr_cpus_allowed <= 1);
-
- WARN_ON_ONCE(!task_on_rq_queued(p));
- WARN_ON_ONCE(!dl_task(p));
-
- return p;
-}
-
/*
* See if the non running -deadline tasks on this rq
* can be sent to some other CPU where they can preempt
--
2.39.3
Some peripheral subsystems request IRQ_TYPE_EDGE_BOTH interrupt type and
report request failures on LIOINTC. To avoid such failures we support to
set IRQ_TYPE_EDGE_BOTH type on LIOINTC, by setting LIOINTC_REG_INTC_EDGE
to true and keep LIOINTC_REG_INTC_POL as is.
Cc: stable(a)vger.kernel.org
Signed-off-by: Yinbo Zhu <zhuyinbo(a)loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
drivers/irqchip/irq-loongson-liointc.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/irqchip/irq-loongson-liointc.c b/drivers/irqchip/irq-loongson-liointc.c
index 2b1bd4a96665..c0c8ef8d27cf 100644
--- a/drivers/irqchip/irq-loongson-liointc.c
+++ b/drivers/irqchip/irq-loongson-liointc.c
@@ -128,6 +128,10 @@ static int liointc_set_type(struct irq_data *data, unsigned int type)
liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, false);
liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, true);
break;
+ case IRQ_TYPE_EDGE_BOTH:
+ liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, true);
+ /* Requester need "both", keep LIOINTC_REG_INTC_POL as is */
+ break;
case IRQ_TYPE_EDGE_RISING:
liointc_set_bit(gc, LIOINTC_REG_INTC_EDGE, mask, true);
liointc_set_bit(gc, LIOINTC_REG_INTC_POL, mask, false);
--
2.47.1
tpm2_start_auth_session() does not mask TPM RC correctly from the callers:
[ 28.766528] tpm tpm0: A TPM error (2307) occurred start auth session
Process TPM RCs inside tpm2_start_auth_session(), and map them to POSIX
error codes.
Cc: stable(a)vger.kernel.org # v6.10+
Fixes: 699e3efd6c64 ("tpm: Add HMAC session start and end functions")
Reported-by: Herbert Xu <herbert(a)gondor.apana.org.au>
Closes: https://lore.kernel.org/linux-integrity/Z_NgdRHuTKP6JK--@gondor.apana.org.a…
Signed-off-by: Jarkko Sakkinen <jarkko(a)kernel.org>
---
drivers/char/tpm/tpm2-sessions.c | 34 ++++++++++++++++----------------
include/linux/tpm.h | 1 +
2 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/drivers/char/tpm/tpm2-sessions.c b/drivers/char/tpm/tpm2-sessions.c
index 3f89635ba5e8..1ed23375e4cb 100644
--- a/drivers/char/tpm/tpm2-sessions.c
+++ b/drivers/char/tpm/tpm2-sessions.c
@@ -40,11 +40,6 @@
*
* These are the usage functions:
*
- * tpm2_start_auth_session() which allocates the opaque auth structure
- * and gets a session from the TPM. This must be called before
- * any of the following functions. The session is protected by a
- * session_key which is derived from a random salt value
- * encrypted to the NULL seed.
* tpm2_end_auth_session() kills the session and frees the resources.
* Under normal operation this function is done by
* tpm_buf_check_hmac_response(), so this is only to be used on
@@ -963,16 +958,13 @@ static int tpm2_load_null(struct tpm_chip *chip, u32 *null_key)
}
/**
- * tpm2_start_auth_session() - create a HMAC authentication session with the TPM
- * @chip: the TPM chip structure to create the session with
+ * tpm2_start_auth_session() - Create an a HMAC authentication session
+ * @chip: A TPM chip
*
- * This function loads the NULL seed from its saved context and starts
- * an authentication session on the null seed, fills in the
- * @chip->auth structure to contain all the session details necessary
- * for performing the HMAC, encrypt and decrypt operations and
- * returns. The NULL seed is flushed before this function returns.
+ * Loads the ephemeral key (null seed), and starts an HMAC authenticated
+ * session. The null seed is flushed before the return.
*
- * Return: zero on success or actual error encountered.
+ * Returns zero on success, or a POSIX error code.
*/
int tpm2_start_auth_session(struct tpm_chip *chip)
{
@@ -1024,11 +1016,19 @@ int tpm2_start_auth_session(struct tpm_chip *chip)
/* hash algorithm for session */
tpm_buf_append_u16(&buf, TPM_ALG_SHA256);
- rc = tpm_transmit_cmd(chip, &buf, 0, "start auth session");
+ rc = tpm_transmit_cmd(chip, &buf, 0, "StartAuthSession");
tpm2_flush_context(chip, null_key);
-
- if (rc == TPM2_RC_SUCCESS)
- rc = tpm2_parse_start_auth_session(auth, &buf);
+ switch (rc) {
+ case TPM2_RC_SUCCESS:
+ break;
+ case TPM2_RC_SESSION_MEMORY:
+ rc = -ENOMEM;
+ goto out;
+ default:
+ rc = -EFAULT;
+ goto out;
+ }
+ rc = tpm2_parse_start_auth_session(auth, &buf);
tpm_buf_destroy(&buf);
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index 6c3125300c00..c1d3d60b416f 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -257,6 +257,7 @@ enum tpm2_return_codes {
TPM2_RC_TESTING = 0x090A, /* RC_WARN */
TPM2_RC_REFERENCE_H0 = 0x0910,
TPM2_RC_RETRY = 0x0922,
+ TPM2_RC_SESSION_MEMORY = 0x0903,
};
enum tpm2_command_codes {
--
2.39.5
From: Wenlin Kang <wenlin.kang(a)windriver.com>
The selftest tpdir2 terminated with a 'Segmentation fault' during loading.
root@localhost:~# cd linux-kenel/tools/testing/selftests/arm64/abi && make
root@localhost:~/linux-kernel/tools/testing/selftests/arm64/abi# ./tpidr2
Segmentation fault
The cause of this is the __arch_clear_user() failure.
load_elf_binary() [fs/binfmt_elf.c]
-> if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bes)))
-> padzero()
-> clear_user() [arch/arm64/include/asm/uaccess.h]
-> __arch_clear_user() [arch/arm64/lib/clear_user.S]
For more details, please see:
https://lore.kernel.org/lkml/1d0342f3-0474-482b-b6db-81ca7820a462@t-8ch.de/…
This issue has been fixed in the mainline. Here I have backported
the relevant commits for the linux-6.6.y branch and attached them.
With these patches, tpdir2 works as:
root@localhost:~/linux-kernel/tools/testing/selftests/arm64/abi# ./tpidr2
TAP version 13
1..5
ok 0 skipped, TPIDR2 not supported
ok 1 skipped, TPIDR2 not supported
ok 2 skipped, TPIDR2 not supported
ok 3 skipped, TPIDR2 not supported
ok 4 skipped, TPIDR2 not supported
This issue is resolved by the first patch. However, to ensure
functional completeness, all related patches were backported
according to the following link.
https://lore.kernel.org/all/20230929031716.it.155-kees@kernel.org/#t
Eric W. Biederman (1):
binfmt_elf: Support segments with 0 filesz and misaligned starts
Kees Cook (5):
binfmt_elf: elf_bss no longer used by load_elf_binary()
binfmt_elf: Use elf_load() for interpreter
binfmt_elf: Use elf_load() for library
binfmt_elf: Only report padzero() errors when PROT_WRITE
mm: Remove unused vm_brk()
fs/binfmt_elf.c | 215 ++++++++++++++++-----------------------------
include/linux/mm.h | 3 +-
mm/mmap.c | 6 --
mm/nommu.c | 5 --
4 files changed, 76 insertions(+), 153 deletions(-)
--
2.43.0
Hello,
New build issue found on stable-rc/linux-5.15.y:
---
variable 'base_clk' is used uninitialized whenever 'if' condition is
true [-Werror,-Wsometimes-uninitialized] in
drivers/mmc/host/sdhci-brcmstb.o (drivers/mmc/host/sdhci-brcmstb.c)
[logspec:kbuild,kbuild.compiler.error]
---
- dashboard: https://d.kernelci.org/i/maestro:eb9b0da83cc077e6176b9903d98f0f78704ac17f
- giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
- commit HEAD: 0b4857306c618d2052f6455b90747ef1df364ecd
Log excerpt:
=====================================================
drivers/mmc/host/sdhci-brcmstb.c:303:6: error: variable 'base_clk' is
used uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
303 | if (res)
| ^~~
drivers/mmc/host/sdhci-brcmstb.c:377:24: note: uninitialized use occurs here
377 | clk_disable_unprepare(base_clk);
| ^~~~~~~~
drivers/mmc/host/sdhci-brcmstb.c:303:2: note: remove the 'if' if its
condition is always false
303 | if (res)
| ^~~~~~~~
304 | goto err;
| ~~~~~~~~
drivers/mmc/host/sdhci-brcmstb.c:296:6: error: variable 'base_clk' is
used uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
296 | if (IS_ERR(priv->cfg_regs)) {
| ^~~~~~~~~~~~~~~~~~~~~~
drivers/mmc/host/sdhci-brcmstb.c:377:24: note: uninitialized use occurs here
377 | clk_disable_unprepare(base_clk);
| ^~~~~~~~
drivers/mmc/host/sdhci-brcmstb.c:296:2: note: remove the 'if' if its
condition is always false
296 | if (IS_ERR(priv->cfg_regs)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
297 | res = PTR_ERR(priv->cfg_regs);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
298 | goto err;
| ~~~~~~~~~
299 | }
| ~
drivers/mmc/host/sdhci-brcmstb.c:281:6: error: variable 'base_clk' is
used uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
281 | if (IS_ERR(host)) {
| ^~~~~~~~~~~~
drivers/mmc/host/sdhci-brcmstb.c:377:24: note: uninitialized use occurs here
377 | clk_disable_unprepare(base_clk);
| ^~~~~~~~
drivers/mmc/host/sdhci-brcmstb.c:281:2: note: remove the 'if' if its
condition is always false
281 | if (IS_ERR(host)) {
| ^~~~~~~~~~~~~~~~~~~
282 | res = PTR_ERR(host);
| ~~~~~~~~~~~~~~~~~~~~
283 | goto err_clk;
| ~~~~~~~~~~~~~
284 | }
| ~
drivers/mmc/host/sdhci-brcmstb.c:260:22: note: initialize the variable
'base_clk' to silence this warning
260 | struct clk *base_clk;
| ^
| = NULL
3 errors generated.
CC [M] drivers/gpu/drm/drm_gem.o
CC [M] drivers/staging/rtl8723bs/hal/odm_EdcaTurboCheck.o
CC [M] drivers/net/ethernet/rocker/rocker_tlv.o
CC [M] drivers/staging/nvec/nvec.o
CC [M] drivers/staging/media/zoran/zoran_card.o
=====================================================
# Builds where the incident occurred:
## defconfig+allmodconfig+CONFIG_FRAME_WARN=2048 on (arm):
- compiler: clang-17
- dashboard: https://d.kernelci.org/build/maestro:67f509a66fa43d168f278a2b
#kernelci issue maestro:eb9b0da83cc077e6176b9903d98f0f78704ac17f
Reported-by: kernelci.org bot <bot(a)kernelci.org>
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Replace kzalloc with kvzalloc for the exit_dump buffer allocation, which
can require large contiguous memory (up to order=9) depending on the
implementation. This change prevents allocation failures by allowing the
system to fall back to vmalloc when contiguous memory allocation fails.
Since this buffer is only used for debugging purposes, physical memory
contiguity is not required, making vmalloc a suitable alternative.
Cc: stable(a)vger.kernel.org
Fixes: 07814a9439a3b0 ("sched_ext: Print debug dump after an error exit")
Suggested-by: Rik van Riel <riel(a)surriel.com>
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Acked-by: Andrea Righi <arighi(a)nvidia.com>
---
Changes in v2:
- Use kvfree() on the free path as well.
- Link to v1: https://lore.kernel.org/r/20250407-scx-v1-1-774ba74a2c17@debian.org
---
kernel/sched/ext.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 66bcd40a28ca1..db9af6a3c04fd 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4623,7 +4623,7 @@ static void scx_ops_bypass(bool bypass)
static void free_exit_info(struct scx_exit_info *ei)
{
- kfree(ei->dump);
+ kvfree(ei->dump);
kfree(ei->msg);
kfree(ei->bt);
kfree(ei);
@@ -4639,7 +4639,7 @@ static struct scx_exit_info *alloc_exit_info(size_t exit_dump_len)
ei->bt = kcalloc(SCX_EXIT_BT_LEN, sizeof(ei->bt[0]), GFP_KERNEL);
ei->msg = kzalloc(SCX_EXIT_MSG_LEN, GFP_KERNEL);
- ei->dump = kzalloc(exit_dump_len, GFP_KERNEL);
+ ei->dump = kvzalloc(exit_dump_len, GFP_KERNEL);
if (!ei->bt || !ei->msg || !ei->dump) {
free_exit_info(ei);
---
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
change-id: 20250407-scx-11dbf94803c3
Best regards,
--
Breno Leitao <leitao(a)debian.org>