On 9/4/25 10:26 AM, Kees Cook wrote:
On Wed, Sep 03, 2025 at 08:38:03PM +0000, Tom Hromatka wrote:
Add an operation, SECCOMP_CLONE_FILTER, that can copy the seccomp filters from another process to the current process.
I roughly reproduced the Docker seccomp filter [1] and timed how long it takes to build it (via libseccomp) and attach it to a process. After 1000 runs, on average it took 3,740,000 TSC ticks (or ~1440 microseconds) on an AMD EPYC 9J14 running at 2596 MHz. The median build/load time was 3,715,000 TSC ticks.
On the same system, I preloaded the above Docker seccomp filter onto a process. (Note that I opened a pidfd to the reference process and left the pidfd open for the entire run.) I then cloned the filter using the feature in this patch to 1000 new processes. On average, it took 9,300 TSC ticks (or ~3.6 microseconds) to copy the filter to the new processes. The median clone time was 9,048 TSC ticks.
This is approximately a 400x performance improvement for those container managers that are using the exact same seccomp filter across all of their containers.
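For anyone who wants to check the arithmetic behind the figures above, here is a minimal sketch. It assumes the TSC ticks at the stated 2596 MHz, and the names (TSC_HZ, ticks_to_us) are purely illustrative and not part of the patch.

```python
# Sanity-check the numbers quoted in the patch description.
# Assumption: the TSC ticks at the stated 2596 MHz.
TSC_HZ = 2596e6


def ticks_to_us(ticks, hz=TSC_HZ):
    """Convert a TSC tick count to microseconds."""
    return ticks / hz * 1e6


build_ticks = 3_740_000  # average libseccomp build + load
clone_ticks = 9_300      # average clone of an existing filter

build_us = ticks_to_us(build_ticks)
clone_us = ticks_to_us(clone_ticks)
speedup = build_ticks / clone_ticks

print(f"build+load: {build_us:.0f} us")
print(f"clone:      {clone_us:.1f} us")
print(f"speedup:    {speedup:.0f}x")
```

The ratio is independent of the assumed TSC frequency, so the ~400x figure holds even if the tick rate differs from the nominal clock.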
Thanks for looking it over. I'll make the technical changes in a v2 in the next week or two.
This is a nice speedup, but with my devil's advocate hat on, are launchers spawning at rates high enough that this makes a difference?
For users who launch VMs that last hours or more, you are correct: this change doesn't matter to them.
But there is a small subset of users who launch containers at a very high rate, and for them startup times are critical.
FWIW, easyseccomp [1] was created a few years ago in part because generating filters with libseccomp can be challenging and somewhat slow.
Thanks!
Tom