The patch below does not apply to the 6.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y
git checkout FETCH_HEAD
git cherry-pick -x 7d9f7d390f6af3a29614e81e802e2b9c238eb7b2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112031-catalyze-sleep-ba6e@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7d9f7d390f6af3a29614e81e802e2b9c238eb7b2 Mon Sep 17 00:00:00 2001
From: Carlos Llamas <cmllamas(a)google.com>
Date: Thu, 30 Oct 2025 01:03:33 +0000
Subject: [PATCH] scripts/decode_stacktrace.sh: fix build ID and PC source
parsing
Support for parsing PC source info in stacktraces (e.g. '(P)') was added
in commit 2bff77c665ed ("scripts/decode_stacktrace.sh: fix decoding of
lines with an additional info"). However, this logic was placed after the
build ID processing. This incorrect order fails to parse lines containing
both elements, e.g.:
drm_gem_mmap_obj+0x114/0x200 [drm 03d0564e0529947d67bb2008c3548be77279fd27] (P)
This patch fixes the problem by extracting the PC source info first and
then processing the module build ID. With this change, the line above is
now properly parsed as such:
drm_gem_mmap_obj (./include/linux/mmap_lock.h:212 ./include/linux/mm.h:811 drivers/gpu/drm/drm_gem.c:1177) drm (P)
While here, also add a brief explanation to the build ID section.
Link: https://lkml.kernel.org/r/20251030010347.2731925-1-cmllamas@google.com
Fixes: 2bff77c665ed ("scripts/decode_stacktrace.sh: fix decoding of lines with an additional info")
Signed-off-by: Carlos Llamas <cmllamas(a)google.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Reviewed-by: Luca Ceresoli <luca.ceresoli(a)bootlin.com>
Cc: Breno Leitao <leitao(a)debian.org>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Mark Brown <broonie(a)kernel.org>
Cc: Matthieu Baerts <matttbe(a)kernel.org>
Cc: Miroslav Benes <mbenes(a)suse.cz>
Cc: Puranjay Mohan <puranjay(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/scripts/decode_stacktrace.sh b/scripts/decode_stacktrace.sh
index c73cb802a0a3..8d01b741de62 100755
--- a/scripts/decode_stacktrace.sh
+++ b/scripts/decode_stacktrace.sh
@@ -277,12 +277,6 @@ handle_line() {
fi
done
- if [[ ${words[$last]} =~ ^[0-9a-f]+\] ]]; then
- words[$last-1]="${words[$last-1]} ${words[$last]}"
- unset words[$last] spaces[$last]
- last=$(( $last - 1 ))
- fi
-
# Extract info after the symbol if present. E.g.:
# func_name+0x54/0x80 (P)
# ^^^
@@ -295,6 +289,14 @@ handle_line() {
last=$(( $last - 1 ))
fi
+ # Join module name with its build id if present, as these were
+ # split during tokenization (e.g. "[module" and "modbuildid]").
+ if [[ ${words[$last]} =~ ^[0-9a-f]+\] ]]; then
+ words[$last-1]="${words[$last-1]} ${words[$last]}"
+ unset words[$last] spaces[$last]
+ last=$(( $last - 1 ))
+ fi
+
if [[ ${words[$last]} =~ \[([^]]+)\] ]]; then
module=${words[$last]}
# some traces format is "(%pS)", which like "(foo+0x0/0x1 [bar])"
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 6.12.58 release.
> There are 565 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
[SNIP]
> Zizhi Wo <wozizhi(a)huaweicloud.com>
> tty/vt: Add missing return value for VT_RESIZE in vt_ioctl()
Locking seems to be messed up in the backport of the above-mentioned patch.
That patch is viewable here:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/…
Upstream uses guard() locking:
| case VT_RESIZE:
| {
| ....
| guard(console_lock)();
| ^^^^^^^^^^^^^^^^^^^^^-------this generates auto-unlock code
| ....
| ret = __vc_resize(vc_cons[i].d, cc, ll, true);
| if (ret)
| return ret;
| ^^^^^^^^^^----------this releases console lock
| ....
| break;
| }
Older stable branches use old-school locking:
| case VT_RESIZE:
| {
| ....
| console_lock();
| ....
| ret = __vc_resize(vc_cons[i].d, cc, ll, true);
| if (ret)
| return ret;
| ^^^^^^^^^^----------this does not release console lock
| ....
| console_unlock();
| break;
| }
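On those branches the error path must drop the lock explicitly. A minimal
sketch of a fixed backport (assuming the rest of the case body stays as
quoted above):
| case VT_RESIZE:
| {
|         ....
|         console_lock();
|         ....
|         ret = __vc_resize(vc_cons[i].d, cc, ll, true);
|         if (ret) {
|                 console_unlock();  /* drop the lock manually on error */
|                 return ret;
|         }
|         ....
|         console_unlock();
|         break;
| }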
Backporting upstream fixes that use guard() locking to older stable
branches that use old-school locking needs extra care.
Please consider dropping or fixing the above-mentioned patch.
--
Jari Ruusu 4096R/8132F189 12D6 4C3A DCDA 0AA4 27BD ACDF F073 3C80 8132 F189
From: Smita Koralahalli <Smita.KoralahalliChannabasappa(a)amd.com>
Extend the logic of handling CMCI storms to AMD threshold interrupts.
Rely on an approach similar to Intel's CMCI to mitigate storms per CPU and
per bank. But, unlike CMCI, do not adjust thresholds to reduce the interrupt
rate during a storm. Rather, disable the interrupt on the corresponding CPU
and bank.
Re-enable the interrupts if enough consecutive polls of the bank show no
corrected errors (30, as programmed by Intel).
Turning off the threshold interrupts would be a better solution on AMD systems
as other error severities will still be handled even if the threshold
interrupts are disabled.
Also, AMD systems currently allow banks to be managed by both polling and
interrupts. So don't modify the set of polled banks after a storm ends.
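For illustration, a stand-alone sketch of the per-bank storm tracking
described above (the structure layout and helper are assumptions for
illustration, not the kernel's exact code; only the 30-poll threshold is
taken from the text):

#include <stdbool.h>
#include <stdint.h>

#define STORM_END_POLL_THRESHOLD 30	/* consecutive clean polls */

struct storm_bank {
	uint64_t history;	/* bit i set = error seen i polls ago */
	bool in_storm_mode;	/* threshold interrupt currently disabled */
};

/* Called on every poll of the bank while a storm is being tracked. */
static void bank_track_poll(struct storm_bank *bank, bool error_seen)
{
	bank->history = (bank->history << 1) | (error_seen ? 1 : 0);

	/* 30 consecutive clean polls: declare the storm over and
	 * re-enable the threshold interrupt for this CPU/bank. */
	if (bank->in_storm_mode &&
	    !(bank->history & ((1ULL << STORM_END_POLL_THRESHOLD) - 1)))
		bank->in_storm_mode = false;
}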
[Tony: Small tweak because mce_handle_storm() isn't a pointer now]
[Yazen: Rebase and simplify]
Stable backport notes:
1. Currently, when a Machine check interrupt storm is detected, the bank's
corresponding bit in mce_poll_banks per-CPU variable is cleared by
cmci_storm_end(). As a result, on AMD's SMCA systems, errors injected or
encountered after the storm subsides are not logged since polling on that
bank has been disabled. Polling banks set on AMD systems should not be
modified when a storm subsides.
2. This patch is a snippet from the CMCI storm handling patch (link below)
that has been accepted into tip for v6.19. While backporting that patch
would have been the preferred way, it cannot be done since it is part of
a larger set. As such, this fix will be temporary. When the original
patch and its set are integrated into stable, this patch should be
reverted.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa(a)amd.com>
Signed-off-by: Tony Luck <tony.luck(a)intel.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo(a)intel.com>
Link: https://lore.kernel.org/20251104-wip-mca-updates-v8-0-66c8eacf67b9@amd.com
Signed-off-by: Avadhut Naik <avadhut.naik(a)amd.com>
---
This is somewhat of a new scenario for me, and I am not really sure about
the procedure. Hence, I haven't modified the commit message or removed the
tags. If required, I will rework both.
Also, while this issue can be encountered on AMD systems using v6.8 and
later stable kernels, we would specifically prefer this fix to be
backported to v6.12 since it is an LTS release.
---
arch/x86/kernel/cpu/mce/threshold.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
index f4a007616468..61eaa1774931 100644
--- a/arch/x86/kernel/cpu/mce/threshold.c
+++ b/arch/x86/kernel/cpu/mce/threshold.c
@@ -85,7 +85,8 @@ void cmci_storm_end(unsigned int bank)
{
struct mca_storm_desc *storm = this_cpu_ptr(&storm_desc);
- __clear_bit(bank, this_cpu_ptr(mce_poll_banks));
+ if (!mce_flags.amd_threshold)
+ __clear_bit(bank, this_cpu_ptr(mce_poll_banks));
storm->banks[bank].history = 0;
storm->banks[bank].in_storm_mode = false;
base-commit: 8b690556d8fe074b4f9835075050fba3fb180e93
--
2.43.0
The retry in lock_vma_under_rcu() drops the rcu read lock before
reacquiring the lock and trying again. This may cause a use-after-free
if the maple node the maple state was using was freed.
The maple state is protected by the rcu read lock. When the lock is
dropped, the state cannot be reused as it tracks pointers to objects
that may be freed during the time where the lock was not held.
Any time the rcu read lock is dropped, the maple state must be
invalidated. Resetting the address and state to MA_START is the safest
course of action, which will result in the next operation starting from
the top of the tree.
Prior to commit 0b16f8bed19c ("mm: change vma_start_read() to drop RCU
lock on failure"), the rcu read lock was dropped and NULL was returned,
so the retry would not have happened. However, now that the read lock
is dropped regardless of the return, we may use a freed maple tree node
cached in the maple state on retry.
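For illustration, a simplified sketch of the retry pattern at issue
(vma_try_lock() is a hypothetical stand-in for the real vma_start_read()
path, which may drop and reacquire the RCU read lock on failure):

	rcu_read_lock();
retry:
	vma = mas_walk(&mas);	/* reuses nodes cached in the maple state */
	if (vma && !vma_try_lock(vma)) {
		/*
		 * The helper dropped and reacquired the RCU read lock, so
		 * the nodes cached in the maple state may have been freed.
		 * Reset the state to MA_START before walking again.
		 */
		mas_set(&mas, address);
		goto retry;
	}
	rcu_read_unlock();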
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: stable(a)vger.kernel.org
Fixes: 0b16f8bed19c ("mm: change vma_start_read() to drop RCU lock on failure")
Reported-by: syzbot+131f9eb2b5807573275c(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=131f9eb2b5807573275c
Signed-off-by: Liam R. Howlett <Liam.Howlett(a)oracle.com>
---
mm/mmap_lock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c
index 39f341caf32c0..f2532af6208c0 100644
--- a/mm/mmap_lock.c
+++ b/mm/mmap_lock.c
@@ -257,6 +257,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
if (PTR_ERR(vma) == -EAGAIN) {
count_vm_vma_lock_event(VMA_LOCK_MISS);
/* The area was replaced with another one */
+ mas_set(&mas, address);
goto retry;
}
--
2.47.2
The return type of __modify_irte_ga() is int, but modify_irte_ga()
treats it as a bool. Casting the int to bool discards the error code.
To fix the issue, change the type of ret to int in modify_irte_ga().
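The pitfall is easy to demonstrate in isolation; a minimal user-space
sketch (not the driver code) showing how a negative errno value collapses
to 1 once stored in a bool:

#include <stdbool.h>
#include <stdio.h>

static int helper(void)
{
	return -22;	/* e.g. -EINVAL */
}

int main(void)
{
	bool as_bool = helper();	/* any nonzero value becomes 1 */
	int as_int = helper();		/* keeps -22 */

	/* prints "bool: 1, int: -22" - the error code is lost */
	printf("bool: %d, int: %d\n", as_bool, as_int);
	return 0;
}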
Fixes: 57cdb720eaa5 ("iommu/amd: Do not flush IRTE when only updating isRun and destination fields")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jinhui Guo <guojinhui.liam(a)bytedance.com>
Reviewed-by: Ankit Soni <Ankit.Soni(a)amd.com>
---
v1: https://lore.kernel.org/all/20251120154725.435-1-guojinhui.liam@bytedance.c…
Changelog in v1 -> v2 (suggested by Ankit Soni)
- Trim subject line to the recommended length
drivers/iommu/amd/iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 2e1865daa1ce..a38304f1a8df 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3354,7 +3354,7 @@ static int __modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index,
static int modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index,
struct irte_ga *irte)
{
- bool ret;
+ int ret;
ret = __modify_irte_ga(iommu, devid, index, irte);
if (ret)
--
2.20.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 0c43094f8cc9d3d99d835c0ac9c4fe1ccc62babd
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025101616-causal-numerator-0c1e@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0c43094f8cc9d3d99d835c0ac9c4fe1ccc62babd Mon Sep 17 00:00:00 2001
From: Nam Cao <namcao(a)linutronix.de>
Date: Tue, 15 Jul 2025 14:46:34 +0200
Subject: [PATCH] eventpoll: Replace rwlock with spinlock
The ready event list of an epoll object is protected by a read-write
lock:
- The consumer (waiter) acquires the write lock and takes items.
- The producer (waker) takes the read lock and adds items.
The point of this design is to enable epoll to scale well with a large
number of producers, as multiple producers can hold the read lock at the
same time.
Unfortunately, this implementation may cause a scheduling priority
inversion problem. Suppose the consumer has higher scheduling priority
than the producer. The consumer needs to acquire the write lock, but may
be blocked by the producer holding the read lock. Since the read-write
lock does not support priority boosting for the readers (even with
CONFIG_PREEMPT_RT=y), we have a case of priority inversion: a
higher-priority consumer is blocked by a lower-priority producer. This
problem was reported in [1].
Furthermore, this could also cause stall problem, as described in [2].
Fix this problem by replacing rwlock with spinlock.
This reduces the event bandwidth, as the producers now have to contend with
each other for the spinlock. According to the benchmark from
https://github.com/rouming/test-tools/blob/master/stress-epoll.c:
On 12 x86 CPUs:
         Before     After      Diff
threads  events/ms  events/ms
      8       7162       4956   -31%
     16       8733       5383   -38%
     32       7968       5572   -30%
     64      10652       5739   -46%
    128      11236       5931   -47%
On 4 riscv CPUs:
         Before     After      Diff
threads  events/ms  events/ms
      8       2958       2833    -4%
     16       3323       3097    -7%
     32       3451       3240    -6%
     64       3554       3178   -11%
    128       3601       3235   -10%
Although the numbers look bad, it should be noted that this benchmark
creates multiple threads that do nothing except constantly generate new
epoll events, thus contention on the spinlock is high. For real
workloads, the event rate is likely much lower, and the performance drop
is not as bad.
Using another benchmark (perf bench epoll wait) where spinlock contention
is lower, an improvement is even observed on x86:
On 12 x86 CPUs:
Before: Averaged 110279 operations/sec (+- 1.09%), total secs = 8
After: Averaged 114577 operations/sec (+- 2.25%), total secs = 8
On 4 riscv CPUs:
Before: Averaged 175767 operations/sec (+- 0.62%), total secs = 8
After: Averaged 167396 operations/sec (+- 0.23%), total secs = 8
In conclusion, no one is likely to be upset over this change. After all,
spinlock was used originally for years, and the commit which converted to
rwlock didn't mention a real workload, just that the benchmark numbers are
nice.
This patch is not exactly a revert of commit a218cc491420 ("epoll: use
rwlock in order to reduce ep_poll_callback() contention"), because a git
revert produces conflicts in some places where the resolution is not
obvious. This patch is intended to be backported, therefore it goes with
the obvious approach:
- Replace rwlock_t with spinlock_t one to one
- Delete list_add_tail_lockless() and chain_epi_lockless(). These were
introduced to allow producers to concurrently add items to the list.
But now that spinlock no longer allows producers to touch the event
list concurrently, these two functions are not necessary anymore.
Fixes: a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Link: https://lore.kernel.org/ec92458ea357ec503c737ead0f10b2c6e4c37d47.1752581388…
Tested-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
Cc: stable(a)vger.kernel.org
Reported-by: Frederic Weisbecker <frederic(a)kernel.org>
Closes: https://lore.kernel.org/linux-rt-users/20210825132754.GA895675@lothringen/ [1]
Reported-by: Valentin Schneider <vschneid(a)redhat.com>
Closes: https://lore.kernel.org/linux-rt-users/xhsmhttqvnall.mognet@vschneid.remote… [2]
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index b22d6f819f78..ee7c4b683ec3 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -46,10 +46,10 @@
*
* 1) epnested_mutex (mutex)
* 2) ep->mtx (mutex)
- * 3) ep->lock (rwlock)
+ * 3) ep->lock (spinlock)
*
* The acquire order is the one listed above, from 1 to 3.
- * We need a rwlock (ep->lock) because we manipulate objects
+ * We need a spinlock (ep->lock) because we manipulate objects
* from inside the poll callback, that might be triggered from
* a wake_up() that in turn might be called from IRQ context.
* So we can't sleep inside the poll callback and hence we need
@@ -195,7 +195,7 @@ struct eventpoll {
struct list_head rdllist;
/* Lock which protects rdllist and ovflist */
- rwlock_t lock;
+ spinlock_t lock;
/* RB tree root used to store monitored fd structs */
struct rb_root_cached rbr;
@@ -741,10 +741,10 @@ static void ep_start_scan(struct eventpoll *ep, struct list_head *txlist)
* in a lockless way.
*/
lockdep_assert_irqs_enabled();
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
list_splice_init(&ep->rdllist, txlist);
WRITE_ONCE(ep->ovflist, NULL);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
static void ep_done_scan(struct eventpoll *ep,
@@ -752,7 +752,7 @@ static void ep_done_scan(struct eventpoll *ep,
{
struct epitem *epi, *nepi;
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* During the time we spent inside the "sproc" callback, some
* other events might have been queued by the poll callback.
@@ -793,7 +793,7 @@ static void ep_done_scan(struct eventpoll *ep,
wake_up(&ep->wq);
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
static void ep_get(struct eventpoll *ep)
@@ -868,10 +868,10 @@ static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
rb_erase_cached(&epi->rbn, &ep->rbr);
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
if (ep_is_linked(epi))
list_del_init(&epi->rdllink);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
wakeup_source_unregister(ep_wakeup_source(epi));
/*
@@ -1152,7 +1152,7 @@ static int ep_alloc(struct eventpoll **pep)
return -ENOMEM;
mutex_init(&ep->mtx);
- rwlock_init(&ep->lock);
+ spin_lock_init(&ep->lock);
init_waitqueue_head(&ep->wq);
init_waitqueue_head(&ep->poll_wait);
INIT_LIST_HEAD(&ep->rdllist);
@@ -1239,100 +1239,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
}
#endif /* CONFIG_KCMP */
-/*
- * Adds a new entry to the tail of the list in a lockless way, i.e.
- * multiple CPUs are allowed to call this function concurrently.
- *
- * Beware: it is necessary to prevent any other modifications of the
- * existing list until all changes are completed, in other words
- * concurrent list_add_tail_lockless() calls should be protected
- * with a read lock, where write lock acts as a barrier which
- * makes sure all list_add_tail_lockless() calls are fully
- * completed.
- *
- * Also an element can be locklessly added to the list only in one
- * direction i.e. either to the tail or to the head, otherwise
- * concurrent access will corrupt the list.
- *
- * Return: %false if element has been already added to the list, %true
- * otherwise.
- */
-static inline bool list_add_tail_lockless(struct list_head *new,
- struct list_head *head)
-{
- struct list_head *prev;
-
- /*
- * This is simple 'new->next = head' operation, but cmpxchg()
- * is used in order to detect that same element has been just
- * added to the list from another CPU: the winner observes
- * new->next == new.
- */
- if (!try_cmpxchg(&new->next, &new, head))
- return false;
-
- /*
- * Initially ->next of a new element must be updated with the head
- * (we are inserting to the tail) and only then pointers are atomically
- * exchanged. XCHG guarantees memory ordering, thus ->next should be
- * updated before pointers are actually swapped and pointers are
- * swapped before prev->next is updated.
- */
-
- prev = xchg(&head->prev, new);
-
- /*
- * It is safe to modify prev->next and new->prev, because a new element
- * is added only to the tail and new->next is updated before XCHG.
- */
-
- prev->next = new;
- new->prev = prev;
-
- return true;
-}
-
-/*
- * Chains a new epi entry to the tail of the ep->ovflist in a lockless way,
- * i.e. multiple CPUs are allowed to call this function concurrently.
- *
- * Return: %false if epi element has been already chained, %true otherwise.
- */
-static inline bool chain_epi_lockless(struct epitem *epi)
-{
- struct eventpoll *ep = epi->ep;
-
- /* Fast preliminary check */
- if (epi->next != EP_UNACTIVE_PTR)
- return false;
-
- /* Check that the same epi has not been just chained from another CPU */
- if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR)
- return false;
-
- /* Atomically exchange tail */
- epi->next = xchg(&ep->ovflist, epi);
-
- return true;
-}
-
/*
* This is the callback that is passed to the wait queue wakeup
* mechanism. It is called by the stored file descriptors when they
* have events to report.
- *
- * This callback takes a read lock in order not to contend with concurrent
- * events from another file descriptor, thus all modifications to ->rdllist
- * or ->ovflist are lockless. Read lock is paired with the write lock from
- * ep_start/done_scan(), which stops all list modifications and guarantees
- * that lists state is seen correctly.
- *
- * Another thing worth to mention is that ep_poll_callback() can be called
- * concurrently for the same @epi from different CPUs if poll table was inited
- * with several wait queues entries. Plural wakeup from different CPUs of a
- * single wait queue is serialized by wq.lock, but the case when multiple wait
- * queues are used should be detected accordingly. This is detected using
- * cmpxchg() operation.
*/
static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
{
@@ -1343,7 +1253,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
unsigned long flags;
int ewake = 0;
- read_lock_irqsave(&ep->lock, flags);
+ spin_lock_irqsave(&ep->lock, flags);
ep_set_busy_poll_napi_id(epi);
@@ -1372,12 +1282,15 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
* chained in ep->ovflist and requeued later on.
*/
if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
- if (chain_epi_lockless(epi))
+ if (epi->next == EP_UNACTIVE_PTR) {
+ epi->next = READ_ONCE(ep->ovflist);
+ WRITE_ONCE(ep->ovflist, epi);
ep_pm_stay_awake_rcu(epi);
+ }
} else if (!ep_is_linked(epi)) {
/* In the usual case, add event to ready list. */
- if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist))
- ep_pm_stay_awake_rcu(epi);
+ list_add_tail(&epi->rdllink, &ep->rdllist);
+ ep_pm_stay_awake_rcu(epi);
}
/*
@@ -1410,7 +1323,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
pwake++;
out_unlock:
- read_unlock_irqrestore(&ep->lock, flags);
+ spin_unlock_irqrestore(&ep->lock, flags);
/* We have to call this outside the lock */
if (pwake)
@@ -1745,7 +1658,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
}
/* We have to drop the new item inside our item list to keep track of it */
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/* record NAPI ID of new item if present */
ep_set_busy_poll_napi_id(epi);
@@ -1762,7 +1675,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
pwake++;
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
/* We have to call this outside the lock */
if (pwake)
@@ -1826,7 +1739,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
* list, push it inside.
*/
if (ep_item_poll(epi, &pt, 1)) {
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
if (!ep_is_linked(epi)) {
list_add_tail(&epi->rdllink, &ep->rdllist);
ep_pm_stay_awake(epi);
@@ -1837,7 +1750,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
/* We have to call this outside the lock */
@@ -2089,7 +2002,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
init_wait(&wait);
wait.func = ep_autoremove_wake_function;
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* Barrierless variant, waitqueue_active() is called under
* the same lock on wakeup ep_poll_callback() side, so it
@@ -2108,7 +2021,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
if (!eavail)
__add_wait_queue_exclusive(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
if (!eavail)
timed_out = !ep_schedule_timeout(to) ||
@@ -2124,7 +2037,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
eavail = 1;
if (!list_empty_careful(&wait.entry)) {
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* If the thread timed out and is not on the wait queue,
* it means that the thread was woken up after its
@@ -2135,7 +2048,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
if (timed_out)
eavail = list_empty(&wait.entry);
__remove_wait_queue(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
}
}
On (25/11/21 12:55), Sergey Senozhatsky wrote:
> Hi Christian,
>
> On (25/11/20 10:15), Christian Loehle wrote:
> > On 11/20/25 04:45, Sergey Senozhatsky wrote:
> > > Hi,
> > >
> > > We are observing a performance regression on one of our arm64 boards.
> > > We tracked it down to the linux-6.6.y commit ada8d7fa0ad4 ("sched/cpufreq:
> > > Rework schedutil governor performance estimation").
> > >
> > > UI speedometer benchmark:
> > > w/commit: 395 +/-38
> > > w/o commit: 439 +/-14
> > >
> >
> > Hi Sergey,
> > Would be nice to get some details. What board?
>
> It's an MT8196 chromebook.
>
> > What do the OPPs look like?
>
> How do I find that out?
>
> > Does this system use uclamp during the benchmark? How?
>
> How do I find that out?
>
> > Given how large the stddev given by speedometer (version 3?) itself is, can we get the
> > stats of a few runs?
>
> v2.1
>
> w/o patch w/ patch
> 440 +/-30 406 +/-11
> 440 +/-14 413 +/-16
> 444 +/-12 403 +/-14
> 442 +/-12 412 +/-15
>
> > Maybe traces of cpu_frequency for both w/ and w/o?
>
> trace-cmd record -e power:cpu_frequency attached.
>
> "base" is with ada8d7fa0ad4
> "revert" is ada8d7fa0ad4 reverted.
I am getting failed delivery notifications. I guess attaching those as
text files wasn't a good idea after all. Vincent, Christian, did you
receive that email?
amdgpu_atpx_detect() uses pci_get_class() in two while loops to
iterate through VGA and OTHER display class PCI devices. Each call to
pci_get_class() increments the reference count of the returned PCI
device. However, after the loops complete, there are no corresponding
pci_dev_put() calls to decrement these reference counts.
Add pci_dev_put() after each while loop to release the reference counts
held by the last devices found in each class.
Found by code review.
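For reference, a sketch of the pci_get_class() iteration contract
(device_is_interesting() is a hypothetical predicate): each call returns
the next matching device with its reference count raised and drops the
reference on the device passed in, and pci_dev_put(NULL) is a no-op, so a
trailing put is safe whether the loop ran to completion or exited early:

	struct pci_dev *pdev = NULL;

	while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) {
		if (device_is_interesting(pdev))
			break;		/* exits with a reference held */
	}
	pci_dev_put(pdev);		/* NULL if the loop ran to the end */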
Cc: stable(a)vger.kernel.org
Fixes: 5d30ed3c2c74 ("Revert "drm/amdgpu: simplify ATPX detection"")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
index 3893e6fc2f03..9eb776a2e8bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c
@@ -617,6 +617,9 @@ static bool amdgpu_atpx_detect(void)
amdgpu_atpx_get_quirks(pdev);
}
+ pci_dev_put(pdev);
+ pdev = NULL;
+
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_OTHER << 8, pdev)) != NULL) {
vga_count++;
@@ -627,6 +630,8 @@ static bool amdgpu_atpx_detect(void)
amdgpu_atpx_get_quirks(pdev);
}
+ pci_dev_put(pdev);
+
if (has_atpx && vga_count == 2) {
acpi_get_name(amdgpu_atpx_priv.atpx.handle, ACPI_FULL_PATHNAME, &buffer);
pr_info("vga_switcheroo: detected switching method %s handle\n",
--
2.17.1