Re: [REGRESSION] bisected: perf: hang when using async-profiler caused by perf: Fix the POLL_HUP delivery breakage

13 Oct 2025

On 10/13/2025 4:05 PM, Peter Zijlstra wrote:
...
On Mon, Oct 13, 2025 at 10:34:27AM +0800, Mi, Dapeng wrote:
...
It looks the issue described in the link
(https://lore.kernel.org/all/20250606192546.915765-1-kan.liang@linux.intel.co...)
happens again but in a different way. :(
As the commit message above link described,  cpu-clock (and task-clock) is
a specific SW event which rely on hrtimer. The hrtimer handler calls
__perf_event_overflow() and then event_stop (cpu_clock_event_stop()) and
eventually call hrtimer_cancel() which traps into a dead loop which waits
for the calling hrtimer handler finishes.
As the
change (https://lore.kernel.org/all/20250606192546.915765-1-kan.liang@linux.intel.co...),
it should be enough to just disable the event and don't need an extra event
stop.
@Octavia, could you please check if the change below can fix this issue?
Thanks.

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7541f6f85fcb..883b0e1fa5d3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10343,7 +10343,20 @@ static int __perf_event_overflow(struct perf_event
*event,
                ret = 1;
                event->pending_kill = POLL_HUP;
                perf_event_disable_inatomic(event);
-               event->pmu->stop(event, 0);



+               /*
+                * The cpu-clock and task-clock are two special SW events,
+                * which rely on the hrtimer. The __perf_event_overflow()
+                * is invoked from the hrtimer handler for these 2 events.
+                * Avoid to call event_stop()->hrtimer_cancel() for these
+                * 2 events since hrtimer_cancel() waits for the hrtimer
+                * handler to finish, which would trigger a deadlock.
+                * Only disabling the events is enough to stop the hrtimer.
+                * See perf_swevent_cancel_hrtimer().
+                */
+               if (event->attr.config != PERF_COUNT_SW_CPU_CLOCK &&
+                   event->attr.config != PERF_COUNT_SW_TASK_CLOCK)
+                       event->pmu->stop(event, 0);
This is broken though; you cannot test config without first knowing
which PMU you're dealing with.
Ah, yes. Just ignore this.
...
Also, that timer really should get stopped, we can't know for certain
this overflow is of the timer itself or not, it could be a related
event.
Something like the below might do -- but please carefully consider the
cases where hrtimer_try_to_cancel() might fail; in those cases we'll
have set HES_STOPPED and the hrtimer callback *SHOULD* observe this and
NORESTART.
But I didn't check all the details.
The only reason that hrtimer_try_to_cancel() could fail is that the hrtimer
callback is currently executing, so current change should be fine.
...
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 820127536e62..a91481d57841 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11756,7 +11756,8 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
 
   event = container_of(hrtimer, struct perf_event, hw.hrtimer);

if (event->state != PERF_EVENT_STATE_ACTIVE)


if (event->state != PERF_EVENT_STATE_ACTIVE ||
   event->hw.state & PERF_HES_STOPPED)

return HRTIMER_NORESTART;

event->pmu->read(event);
@@ -11810,7 +11811,7 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
   	ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
   	local64_set(&hwc->period_left, ktime_to_ns(remaining));

hrtimer_cancel(&hwc->hrtimer);




hrtimer_try_to_cancel(&hwc->hrtimer);

}

}
 
@@ -11854,12 +11855,14 @@ static void cpu_clock_event_update(struct perf_event *event)
 
 static void cpu_clock_event_start(struct perf_event *event, int flags)
 {

event->hw.state = 0;
local64_set(&event->hw.prev_count, local_clock());
perf_swevent_start_hrtimer(event);

}
 
 static void cpu_clock_event_stop(struct perf_event *event, int flags)
 {

event->hw.state = PERF_HES_STOPPED;
perf_swevent_cancel_hrtimer(event);
if (flags & PERF_EF_UPDATE)
cpu_clock_event_update(event);

Besides cpu-clock, task-clock should need similar change as well. I would
post a complete change later.

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [REGRESSION] bisected: perf: hang when using async-profiler caused by perf: Fix the POLL_HUP delivery breakage