Re: [PATCH v4 00/15] Add futex2 syscalls

8 Jun 2021


Excerpts from Andrey Semashev's message of June 6, 2021 11:15 pm:
...
On 6/6/21 2:57 PM, Nicholas Piggin wrote:
...
Excerpts from Andrey Semashev's message of June 5, 2021 6:56 pm:
...
On 6/5/21 4:09 AM, Nicholas Piggin wrote:
...
Excerpts from André Almeida's message of June 5, 2021 6:01 am:
...
Às 08:36 de 04/06/21, Nicholas Piggin escreveu:
...
...
I'll be burned at the stake for suggesting it but it would be great if
we could use file descriptors. At least for the shared futex, maybe
private could use a per-process futex allocator. It solves all of the
above, although I'm sure it has many problems of its own. It may not play
so nicely with the pthread mutex API because of the whole static
initialiser problem, but the first futex proposal did use fds. But it's
an example of an alternate API.
FDs and futexes don't play well together, because for futex_wait() you
need to tell the kernel the expected value at the futex address to avoid
sleeping on a free lock. FD operations (poll, select) don't have this
`value` argument, so they could sleep forever, but I'm not sure if you
had taken this into consideration.
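For readers following along, the expected-value check can be sketched with the existing futex(2) call. This is only an illustration: the helper name wait_if_equal() is made up here, and the private, timeout-less variant of the operation is an arbitrary choice.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: sleep on *addr only if it still holds `expected`.
 * The kernel re-checks the value atomically with queueing the waiter, so a
 * lock released between the userspace check and the syscall cannot cause a
 * lost wakeup. On a mismatch the call fails at once with errno == EAGAIN.
 */
static long wait_if_equal(uint32_t *addr, uint32_t expected)
{
    return syscall(SYS_futex, addr, FUTEX_WAIT_PRIVATE, expected,
                   NULL /* no timeout */, NULL, 0);
}
```

Calling wait_if_equal(&word, 0) while word holds 1 returns -1 with EAGAIN instead of sleeping, which is the behaviour a plain poll() or select() on an fd has no argument to express.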
I had. The futex wait API would take an fd as an additional argument. The
only difference is that the waitqueue used when a sleep or wake is
required is derived from the fd, not from an address.
I think the bigger sticking points would be if it's too heavyweight an
object to use (which could be somewhat mitigated with a simpler ida
allocator although that's difficult to do with shared), and whether libc
could sanely use them due to the static initialiser problem of pthread
mutexes.
The static initialization feature is not the only benefit of the current
futex design, and probably not the most important one. You can work
around the static initialization in userspace, e.g. by initializing fd
to an invalid value and creating a valid fd upon the first use. Although
that would still incur a performance penalty and add a new source of
failure.
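A rough sketch of that workaround, under loud assumptions: no fd-based futex exists, so eventfd(2) stands in for "some kernel object behind an fd", and the names lazy_fd_lock, LAZY_FD_LOCK_INIT and lazy_fd_get() are invented for illustration.

```c
#define _GNU_SOURCE
#include <stdatomic.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical lock object that can still be statically initialised. */
typedef struct { _Atomic int fd; } lazy_fd_lock;
#define LAZY_FD_LOCK_INIT { -1 }   /* -1 means "no fd created yet" */

/* Return the lock's fd, creating it on first use. Returns -1 on failure. */
static int lazy_fd_get(lazy_fd_lock *l)
{
    int fd = atomic_load(&l->fd);
    if (fd >= 0)
        return fd;                      /* already initialised */

    int fresh = eventfd(0, EFD_CLOEXEC); /* stand-in for a futex fd */
    if (fresh < 0)
        return -1;                      /* new failure source, e.g. EMFILE */

    int expected = -1;
    if (atomic_compare_exchange_strong(&l->fd, &expected, fresh))
        return fresh;                   /* this thread won the race */

    close(fresh);                       /* another thread installed its fd */
    return expected;                    /* CAS stored the winner's fd here */
}
```

The extra atomic load on every use and the error return on first use correspond to the performance penalty and the new source of failure mentioned above.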
Sounds like a serious problem, but maybe it isn't. On the other hand,
maybe we don't have to support pthread mutexes as they are anyway
because futex already does that fairly well.
...
What is more important is that waiting on fd always requires a kernel
call. This will be terrible for performance of uncontended locks, which
is the majority of time.
No. As I said just before, it would be the same except the waitqueue is
derived from fd rather than address.
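To make the uncontended case concrete, here is a minimal sketch of a futex-backed lock in the style of the well-known three-state design (0 unlocked, 1 locked, 2 locked with possible waiters). It is not taken from glibc or from the patch series, and the sketch_* names are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked and there may be waiters */
typedef struct { _Atomic uint32_t state; } sketch_mutex;

static long sketch_futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_lock(sketch_mutex *m)
{
    uint32_t c = 0;
    /* Fast path: one atomic op in userspace, no kernel entry. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;
    /* Slow path: mark the lock contended, then sleep until it is free. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        sketch_futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void sketch_unlock(sketch_mutex *m)
{
    /* Enter the kernel only if someone may be sleeping on the word. */
    if (atomic_exchange(&m->state, 0) == 2)
        sketch_futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}
```

In the fd variant described above, only the two sketch_futex() calls on the slow path would presumably gain an fd argument; the userspace fast path would stay as it is.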
Sorry, in that case I'm not sure I understand how that would work. You
do need to allocate a fd, don't you?
Yes. As I said, imagine a futex_wait API that also takes a fd. The
wait queue is derived from that fd rather than the hash table.
...
...
...
Another important point is that a futex that is not being waited on
consumes zero kernel resources, while an fd is a limited resource even
when not used. You can have millions of futexes in userspace and you are
guaranteed not to exhaust any limit as long as you have memory. That is
an important feature, and the current userspace is relying on it by
assuming that creating mutexes and condition variables is cheap.
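As a small illustration of the two kinds of limit being compared (helper names are invented for this sketch): fds are capped per process by RLIMIT_NOFILE, while idle futex words are bounded only by the memory that holds them.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Soft per-process cap on open file descriptors, or 0 on error. */
static rlim_t fd_soft_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    return rl.rlim_cur;
}

/*
 * n zero-initialised futex words. Until one of them is actually waited on,
 * the only cost is n * sizeof(uint32_t) bytes of user memory.
 */
static uint32_t *alloc_futex_words(size_t n)
{
    return calloc(n, sizeof(uint32_t));
}
```

Allocating a million such words takes about 4 MB of user memory, whereas a million fds would require raising the fd limit well beyond what many systems set as the default soft limit.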
Is it an important feature? Would 1 byte of kernel memory per uncontended
futex be okay? 10? 100?
I do see that it's very nice that the current design requires no
initialization for uncontended futexes; I'm just asking questions to get
an idea of what constraints we're working with. We have a pretty good API
already which can support unlimited uncontended futexes, so I'm
wondering whether we really need another very, very similar API that
doesn't fix the really difficult problems of the existing one.
It does provide the very much needed features that are missing in the 
current futex. Namely, more futex sizes and wait for multiple. So the 
argument of "why have two similar APIs" is not quite fair. It would be, 
if there was feature parity with futex.
It does provide some extra features sure, with some straightforward 
extension of the existing API. The really interesting or tricky part of
the API is left unchanged though.
My line of thinking is that while we're changing the API anyway, we 
should see if it can be changed to help those other problems too.
...
I believe, the low cost of a futex is an important feature, and was one 
of the reasons for its original design and introduction.
It is of course. The first futex proposal did use fds, interestingly.
I didn't look back further into the libc side of that thing, but maybe
I should.
...
Otherwise we 
would be using eventfds in mutexes.
I don't think so, not even if eventfd came before the futex syscall.
...
One other feature that I didn't mention earlier and which follows from 
its "address in memory" design is the ability to use futexes in 
process-shared memory. This is important for process-shared pthread 
components, too, but has its own value even without this, if you use 
futexes directly. With fds, you can't place the fd in shared memory
since every process needs to have its own fd referring to the same 
kernel object, and passing fds cannot be done without a UNIX socket. 
This is incompatible with pthreads API design and would require 
non-trivial design changes to the applications using futexes directly.
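The "address in memory" property can be sketched with today's futex(2): two processes that map the same page can wait and wake on a word in it without exchanging any handle. The function name shared_futex_demo() is hypothetical, and an anonymous shared mapping inherited across fork() is used only to keep the example short.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the parent observed the child's store via the shared word. */
static int shared_futex_demo(void)
{
    _Atomic uint32_t *word = mmap(NULL, sizeof *word, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (word == MAP_FAILED)
        return -1;
    atomic_store(word, 0);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(word, sizeof *word);
        return -1;
    }
    if (pid == 0) {
        /* Child: publish the value, then wake one waiter on the word. */
        atomic_store(word, 1);
        syscall(SYS_futex, word, FUTEX_WAKE, 1, NULL, NULL, 0);
        _exit(0);
    }

    /* Parent: non-private FUTEX_WAIT, so the key is the shared page. */
    while (atomic_load(word) == 0)
        syscall(SYS_futex, word, FUTEX_WAIT, 0, NULL, NULL, 0);

    int status;
    waitpid(pid, &status, 0);
    uint32_t seen = atomic_load(word);
    munmap(word, sizeof *word);
    return seen == 1 ? 0 : -1;
}
```

Neither process passes anything to the other beyond the mapping itself, which is the property an fd-based design would have to replace with some handle-sharing step.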
That may be true. A file is a natural object for sharing such a resource,
but the means of sharing the fd are not so easy. OTOH you could also use a
syscall to open the same file and get a new fd.
I wonder: are shared pthread mutexes, which use existing pthread APIs
that are implemented fine today with the futex1 system call, a good
reason to constrain futex2? Or do we have an opportunity to make a bigger
change to the API so it suffers less from non-deterministic latency (for
example)?
I don't want to limit it to just files vs addresses; fds were an example
of something that could solve some of the problems.
Thanks,
Nick
