On Thu, Apr 08, 2021 at 12:35PM +0200, Marco Elver wrote: [...]
Motivation and Example Uses
1. Our immediate motivation is low-overhead sampling-based race detection for user space [1]. By using perf_event_open() at process initialization, we can create hardware breakpoint/watchpoint events that are propagated automatically to all threads in a process. As far as we are aware, today no existing kernel facility (such as ptrace) allows us to set up process-wide watchpoints with minimal overheads (that are comparable to mprotect() of whole pages).
2. Other low-overhead error detectors that rely on detecting accesses to certain memory locations or code, process-wide and also only in a specific set of subtasks or threads.
[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
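For readers who want to see what the process-wide watchpoint setup described above looks like, here is a minimal sketch. It is not taken from the patch series or from GWPSan. The helper names (make_watchpoint_attr, open_process_wide_watchpoint, on_sigtrap, watch_hits) are hypothetical, and the 4-byte write watchpoint is an arbitrary choice for illustration. It assumes a kernel and UAPI headers recent enough to provide the sigtrap, remove_on_exec and inherit_thread attribute bits, and a CPU with hardware breakpoint support.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical counter, bumped each time the watchpoint fires. */
static volatile sig_atomic_t watch_hits;

/* Runs synchronously in the thread that triggered the watchpoint. */
static void on_sigtrap(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)info;
	(void)ucontext;
	watch_hits++;
}

/* Hypothetical helper: describe a 4-byte write watchpoint on addr. */
struct perf_event_attr make_watchpoint_attr(const volatile void *addr)
{
	struct perf_event_attr attr;

	assert(addr != NULL);
	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_BREAKPOINT;
	attr.size = sizeof(attr);
	attr.sample_period = 1;            /* signal on every hit */
	attr.bp_type = HW_BREAKPOINT_W;    /* trigger on writes */
	attr.bp_addr = (uintptr_t)addr;
	attr.bp_len = HW_BREAKPOINT_LEN_4;
	attr.inherit = 1;                  /* children inherit the event ... */
	attr.inherit_thread = 1;           /* ... but only CLONE_THREAD ones */
	attr.remove_on_exec = 1;           /* required when sigtrap is set */
	attr.sigtrap = 1;                  /* synchronous SIGTRAP on event */
	attr.exclude_kernel = 1;           /* user-space accesses only */
	attr.exclude_hv = 1;
	return attr;
}

/*
 * Hypothetical helper: install a SIGTRAP handler, then open the event for
 * the calling thread. Threads created afterwards inherit it, which is why
 * this would run at process initialization.
 * Returns the perf event fd, or -1 with errno set.
 */
int open_process_wide_watchpoint(const volatile void *addr)
{
	struct perf_event_attr attr = make_watchpoint_attr(addr);
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigtrap;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGTRAP, &sa, NULL) != 0)
		return -1;

	/* pid = 0 (this thread), cpu = -1 (any CPU), group_fd = -1 (no group) */
	return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1,
			    PERF_FLAG_FD_CLOEXEC);
}
```

Calling open_process_wide_watchpoint(&some_variable) early in main(), before any threads are spawned, would be the intended use. The call may fail with errno set, for example if perf_event_paranoid forbids it or no hardware breakpoint is available, so the result needs to be checked. Installing the handler before opening the event matters, since the default action for SIGTRAP terminates the process.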
Other use-case ideas we found interesting; they are only meant to illustrate the range of potential and further motivate the utility (we're sure there are more):
Code hot patching without full stop-the-world. Specifically, set a code breakpoint at the entry to the patched routine, then send signals to the threads and check that they are not in the routine, without stopping them further. If any thread enters the routine, it receives SIGTRAP and pauses.
Safepoints without mprotect(). Some Java implementations use "load from a known memory location" as a safepoint. When threads need to be stopped, the page containing the location is mprotect()ed and threads get a signal. This could be replaced with a watchpoint, which does not require a whole page nor DTLB shootdowns.
Threads receiving signals on performance events to throttle/unthrottle themselves.
Tracking data flow globally.
For future reference:
I often wonder what happened to some new kernel feature, and how people are using it. I'm guessing there must be other users of "synchronous signals on perf events" somewhere by now (?), but the reason the whole thing started was because of points #1 and #2 above.
Now 3 years later we were able to open source a framework that does #1 and #2 and more: https://github.com/google/gwpsan - "A framework for low-overhead sampling-based dynamic binary instrumentation, designed for implementing various bug detectors (also called "sanitizers") suitable for production uses. GWPSan does not modify the executed code, but instead performs dynamic analysis from signal handlers."
Documentation is sparse, it's still in development, and probably has numerous sharp corners right now...
That being said, the code demonstrates how low-overhead "process-wide synchronous event handling" thanks to perf events can be used to implement crazier things outside the realm of performance profiling.
Thanks!
-- Marco