Hi all,
We found the following issue with Linux v6.12-rc7 on the KunPeng920 platform. When a perf hardware breakpoint is used to trigger a signal handler, the pc value in the signal ucontext very occasionally does not match the expected breakpoint address.
Attached please find the reproducer for this issue. It uses perf events to set a hardware breakpoint on an address and binds a signal to the perf event fd. When execution hits the breakpoint address, the signal handler compares the pc value copied from the signal ucontext with the real breakpoint address and checks for a mismatch.
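For readers without the attachment, the setup described above could look roughly like the sketch below. This is not the attached reproducer: the helper names (make_bp_attr, open_bp_event, on_bp_signal, install_bp_handler), the globals, and the choice of F_SETOWN with getpid() are assumptions made for illustration. Only the general mechanism (perf_event_open with PERF_TYPE_BREAKPOINT, a signal bound to the fd, and a pc comparison in the handler) comes from the description.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <ucontext.h>
#include <unistd.h>

/* Hypothetical globals shared with the signal handler. */
volatile sig_atomic_t pc_mismatches;
uint64_t expected_pc;

/* Hypothetical helper: describe an execute breakpoint at addr. */
void make_bp_attr(struct perf_event_attr *attr, uint64_t addr)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_BREAKPOINT;
    attr->size = sizeof(*attr);
    attr->bp_type = HW_BREAKPOINT_X;
    attr->bp_addr = addr;
    attr->bp_len = sizeof(long);  /* length perf_event_open(2) asks for on execute breakpoints */
    attr->sample_period = 1;
    attr->wakeup_events = 1;      /* wake up, and so signal, on every hit */
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/* Handler: compare the pc saved in the signal frame with the expected one. */
void on_bp_signal(int sig, siginfo_t *info, void *ucv)
{
    (void)sig;
    (void)info;
#if defined(__aarch64__)
    ucontext_t *uc = ucv;
    if (uc->uc_mcontext.pc != expected_pc)
        pc_mismatches++;
#else
    (void)ucv;                    /* pc lookup is only sketched for arm64 */
#endif
}

/* Install the handler with SA_SIGINFO so it receives the ucontext. */
int install_bp_handler(int signum)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_bp_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}

/* Open the breakpoint event on the calling thread and bind signum to its fd.
 * Returns the fd, or -1 on failure (for example, no hardware breakpoints). */
int open_bp_event(uint64_t addr, int signum)
{
    struct perf_event_attr attr;
    make_bp_attr(&attr, addr);

    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd < 0)
        return -1;

    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 ||
        fcntl(fd, F_SETFL, flags | O_ASYNC) < 0 ||
        fcntl(fd, F_SETSIG, signum) < 0 ||
        fcntl(fd, F_SETOWN, getpid()) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would install the handler, set expected_pc to the breakpoint address, call open_bp_event(), and then repeatedly execute the instruction at that address while watching pc_mismatches.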
While looking into the flow of execution:

// normal flow of exe:
a.out-19844 [038] d... 8763.348609: hw_breakpoint_control: perf user addr: 400c6c, bp addr: 400c6c, ops: 0
// breakpoint exception:
a.out-19844 [038] d... 8763.348611: do_debug_exception: ec: 0, pc: 400c6c, pstate: 20001000
a.out-19844 [038] d... 8763.348611: breakpoint_handler: perf bp read: 400c6c
// send signal:
a.out-19844 [038] d.h. 8763.348613: send_sigio_to_task <-send_sigio
a.out-19844 [038] d.h. 8763.348614: <stack trace>
 => send_sigio_to_task
 => send_sigio
 => kill_fasync_rcu
 => kill_fasync
 => perf_event_wakeup
 => perf_pending_event
 => irq_work_single
 => irq_work_run_list
 => irq_work_run
 => do_handle_IPI
 => ipi_handler
 => handle_percpu_devid_fasteoi_ipi
 => __handle_domain_irq
 => gic_handle_irq
 => el0_irq_naked
a.out-19844 [038] d.h. 8763.348614: send_sigio_to_task: step in with signum 38
a.out-19844 [038] d.h. 8763.348615: send_sigio_to_task: will do_send_sig_info 38
// kernel signal handling:
a.out-19844 [038] .... 8763.348616: do_notify_resume: thread_flags 2097665, _TIF_SIGPENDING 1, _TIF_NOTIFY_SIGNAL40
a.out-19844 [038] .... 8763.348617: setup_sigframe: restore sig: 400c6c
// single step exception:
a.out-19844 [038] d... 8763.348619: do_debug_exception: ec: 1, pc: 400988, pstate: 20001000
a.out-19844 [038] d... 8763.348621: hw_breakpoint_control: perf user addr: 400c6c, bp addr: 400c6c, ops: 1

// abnormal flow of exe:
a.out-19844 [084] d... 8763.782103: hw_breakpoint_control: perf user addr: 400c6c, bp addr: 400c6c, ops: 0
// breakpoint exception:
a.out-19844 [084] d... 8763.782104: do_debug_exception: ec: 0, pc: 400c6c, pstate: 20001000
// single step exception:
a.out-19844 [084] d... 8763.782105: breakpoint_handler: perf bp read: 400c6c
a.out-19844 [084] d... 8763.782107: do_debug_exception: ec: 1, pc: 400c70, pstate: 20001000
// send signal:
a.out-19844 [084] d.h. 8763.782108: send_sigio_to_task <-send_sigio
a.out-19844 [084] d.h. 8763.782109: <stack trace>
 => send_sigio_to_task
 => send_sigio
 => kill_fasync_rcu
 => kill_fasync
 => perf_event_wakeup
 => perf_pending_event
 => irq_work_single
 => irq_work_run_list
 => irq_work_run
 => do_handle_IPI
 => ipi_handler
 => handle_percpu_devid_fasteoi_ipi
 => __handle_domain_irq
 => gic_handle_irq
 => el0_irq_naked
a.out-19844 [084] d.h. 8763.782110: send_sigio_to_task: step in with signum 38
a.out-19844 [084] d.h. 8763.782110: send_sigio_to_task: will do_send_sig_info 38
// kernel signal handling:
a.out-19844 [084] .... 8763.782111: do_notify_resume: thread_flags 513, _TIF_SIGPENDING 1, _TIF_NOTIFY_SIGNAL40
a.out-19844 [084] .... 8763.782113: setup_sigframe: restore sig: 400c70
a.out-19844 [084] d... 8763.782115: hw_breakpoint_control: perf user addr: 400c6c, bp addr: 400c6c, ops: 1

The kernel sends this signal to the task by queueing perf_pending_event (which sends the signal) with irq_work_queue and then raising an IPI so that the queued work gets run. It seems that when the mismatch occurs, the IPI does not interrupt the breakpoint exception handling as it should, so the signal has not yet been sent when that exception returns. As a result, do_notify_resume finds no pending signal at that point, the single-step exception is taken first (pc: 400c70 in the abnormal trace), and the kernel then sets up the signal frame with the stepped pc instead of the breakpoint address.
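
The two orderings seen in the traces can be condensed into a small model. This is only an illustration of the hypothesis above, not kernel code: the enum, the function name sigframe_pc, and the assumption that the instruction at the breakpoint is a non-branching 4-byte A64 instruction are all made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Which event the CPU handles first after the breakpoint exception. */
enum bp_order {
    IRQ_WORK_BEFORE_STEP,   /* normal: IPI runs perf_pending_event first */
    STEP_BEFORE_IRQ_WORK    /* abnormal: single-step exception comes first */
};

/* Model of the pc that setup_sigframe would save for the SIGIO-style signal.
 * Assumes the instruction at bp_addr does not branch. */
uint64_t sigframe_pc(uint64_t bp_addr, enum bp_order order)
{
    assert(bp_addr % 4 == 0);   /* A64 instructions are 4-byte aligned */

    uint64_t pc = bp_addr;      /* pc reported by the breakpoint exception */
    if (order == STEP_BEFORE_IRQ_WORK)
        pc += 4;                /* instruction was stepped before the signal was sent */
    return pc;
}
```

With the address from the traces, sigframe_pc(0x400c6c, IRQ_WORK_BEFORE_STEP) gives 0x400c6c and sigframe_pc(0x400c6c, STEP_BEFORE_IRQ_WORK) gives 0x400c70, matching the two "restore sig" values logged by setup_sigframe.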

Thanks,
Chen
linux-stable-mirror@lists.linaro.org