On 8/3/2021 12:17 PM, Peter Zijlstra wrote:
On Tue, Aug 03, 2021 at 11:20:20AM -0400, Liang, Kan wrote:
On 8/3/2021 10:55 AM, Peter Zijlstra wrote:
On Tue, Aug 03, 2021 at 06:25:28AM -0700, kan.liang@linux.intel.com wrote:
From: Kan Liang <kan.liang@linux.intel.com>
The warning below may occasionally be triggered on an ADL machine when all of these conditions hold:
- Two perf record commands run one by one. Both record a PEBS event.
- Both run on small cores.
- They have different adaptive PEBS configuration (PEBS_DATA_CFG).
[ 673.663291] WARNING: CPU: 4 PID: 9874 at arch/x86/events/intel/ds.c:1743 setup_pebs_adaptive_sample_data+0x55e/0x5b0
[ 673.663348] RIP: 0010:setup_pebs_adaptive_sample_data+0x55e/0x5b0
[ 673.663357] Call Trace:
[ 673.663357] <NMI>
[ 673.663357] intel_pmu_drain_pebs_icl+0x48b/0x810
[ 673.663360] perf_event_nmi_handler+0x41/0x80
[ 673.663368] </NMI>
[ 673.663370] __perf_event_task_sched_in+0x2c2/0x3a0
Unlike the big core, the small core requires the ACK right before re-enabling counters in the NMI handler. Otherwise a stale PEBS record may be dumped into a later NMI handler, which triggers the warning.
Add a new mid_ack flag to track the case. Add all PMI handler bits in the struct x86_hybrid_pmu to track the bits for different types of PMUs. Apply mid ACK for the small cores on an Alder Lake machine.
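To make the three ACK placements discussed in this thread concrete, here is a minimal user-space sketch of the ordering only. It is not the kernel code: the enum, the trace buffer, and the stub functions are hypothetical stand-ins, and only the names mid_ack and late_ack come from the thread. It also assumes that "early" means acknowledging at handler entry.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical ACK policies. Only the terms early, mid and late ACK
 * come from the thread; this enum is made up for illustration. */
enum ack_mode { ACK_EARLY, ACK_MID, ACK_LATE };

/* Records the order of the steps so the sequence can be inspected. */
static char trace[128];

static void step(const char *name)
{
	/* Room for an optional separator, the name and the terminator. */
	assert(strlen(trace) + strlen(name) + 2 <= sizeof(trace));
	if (trace[0])
		strcat(trace, " ");
	strcat(trace, name);
}

/* Stand-ins for the real hardware operations (assumptions, not kernel APIs). */
static void ack_apic(void)         { step("ack"); }
static void disable_counters(void) { step("disable"); }
static void drain_pebs(void)       { step("drain"); }
static void enable_counters(void)  { step("enable"); }

/* Models one pass through a PMI handler and returns the step order. */
const char *handle_pmi(enum ack_mode mode)
{
	trace[0] = '\0';

	if (mode == ACK_EARLY)		/* assumed: ACK at handler entry */
		ack_apic();

	disable_counters();
	drain_pebs();

	if (mode == ACK_MID)		/* ACK right before re-enabling counters */
		ack_apic();

	enable_counters();

	if (mode == ACK_LATE)		/* ACK after counters are running again */
		ack_apic();

	return trace;
}
```

Calling handle_pmi(ACK_MID) yields the order "disable drain ack enable", which is the placement the patch description asks for on small cores. The early and late variants differ only in where the ACK lands relative to the re-enable step.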
Why do we need a new option? Why isn't early (as in not late) good enough?
The early ACK fixes this issue, but it triggers a spurious NMI during the stress test. I was told to do the ACK right before re-enabling counters for small cores. That indeed fixes all the issues.
Any chance that would also work for the chips that now use late_ack?
Let me check and do some tests.
Thanks, Kan