By default, when users program perf to sample branch instructions (PERF_COUNT_HW_BRANCH_INSTRUCTIONS) with a sample period of 1, perf treats this as a special case and enables BTS (Branch Trace Store) as an optimization to avoid taking an interrupt on every branch.
Since BTS doesn't virtualize, this optimization doesn't make sense when the request originates from a guest. Add a check that disables the optimization for virtualized events, i.e. events that have attr.exclude_host set.
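For illustration only, the resulting decision can be modelled in user space roughly as below. This is a sketch, not kernel code: struct model_event and model_has_bts_period() are hypothetical stand-ins, and the is_branch_insn flag replaces the real comparison of hwc->config against x86_pmu.event_map(PERF_COUNT_HW_BRANCH_INSTRUCTIONS).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_event fields the check reads. */
struct model_event {
	bool freq;           /* models attr.freq (frequency-based sampling) */
	bool exclude_host;   /* models attr.exclude_host */
	bool is_branch_insn; /* assumed: hw event maps to branch instructions */
};

/* Follows the order of checks in the patched intel_pmu_has_bts_period(). */
static bool model_has_bts_period(const struct model_event *ev, uint64_t period)
{
	/* Only fixed rate period==1 events are candidates for BTS. */
	if (ev->freq || period != 1)
		return false;

	/* BTS doesn't virtualize, so events that exclude the host never use it. */
	if (ev->exclude_host)
		return false;

	return ev->is_branch_insn;
}
```

In this model, a branch-instructions event with period 1 is eligible for BTS only while exclude_host is clear; setting exclude_host, or using any other period, makes the function return false.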
Reported-by: Jan H. Schönherr <jschoenh@amazon.de>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Fernand Sieber <sieberf@amazon.com>
---
 arch/x86/events/perf_event.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 3161ec0a3416..f2e2d9b03367 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1574,13 +1574,22 @@ static inline bool intel_pmu_has_bts_period(struct perf_event *event, u64 period
 	struct hw_perf_event *hwc = &event->hw;
 	unsigned int hw_event, bts_event;

-	if (event->attr.freq)
+	/*
+	 * Only use BTS for fixed rate period==1 events.
+	 */
+	if (event->attr.freq || period != 1)
+		return false;
+
+	/*
+	 * BTS doesn't virtualize.
+	 */
+	if (event->attr.exclude_host)
 		return false;

 	hw_event = hwc->config & INTEL_ARCH_EVENT_MASK;
 	bts_event = x86_pmu.event_map(PERF_COUNT_HW_BRANCH_INSTRUCTIONS);

-	return hw_event == bts_event && period == 1;
+	return hw_event == bts_event;
 }

 static inline bool intel_pmu_has_bts(struct perf_event *event)