On Thu, Jan 30, 2025 at 03:32:36PM -0800, Andrew Morton wrote:
On Thu, 30 Jan 2025 23:10:53 +0000 Pedro Falcato pedro.falcato@gmail.com wrote:
On Thu, Jan 30, 2025 at 10:53 PM Lorenzo Stoakes lorenzo.stoakes@oracle.com wrote:
The above code sequence doesn't seem at all onerous. I'm not understanding why it's worth altering the kernel to permit this little shortcut?
In practice it adds quite a bit of overhead for something that whatever mechanism is using the pidfd can avoid.
It was specifically intended for a real case of utilising process_madvise(), using the newly extended ability to batch _any_ madvise() operations for the current process, like:
if (process_madvise(PIDFD_SELF, iovec, 10, MADV_GUARD_INSTALL, 0)) { ... error handling ... }
vs.
pid_t pid = getpid(); int pidfd = pidfd_open(pid, PIDFD_THREAD); if (pidfd < 0) { ... error handling ... } if (process_madvise(PIDFD_SELF, iovec, 10, MADV_GUARD_INSTALL, 0)) { ... cleanup pidfd ... ... error handling ... } ... ... cleanup pidfd ...
So in practice, it's actually a lot more ceremony and noise. Suren has been working with this code in practice and found this to be useful.
It's also nice to add that people on the libc/allocator side should also appreciate skipping pidfd_open's reliability concerns (mostly, that RLIMIT_NOFILE Should Not(tm) ever affect thread spawning or a malloc[1]). Besides the big syscall reduction and nice speedup, that is.
[1] whether this is the already case is an exercise left to the reader, but at the very least we should not add onto existing problems
Thanks.
Could we please get all the above spelled out much more thoroughly in the [0/n] description (aka Patch Series Sales Brochure)?
Ack, will expand if there's a respin, or Christian - perhaps you could fold the above explanation into the cover letter?
Intent is for Christian to take this in his tree (if he so wishes) to be clear!
Cheers