2024-12-12, 23:46:11 +0100, Antonio Quartulli wrote:
On 12/12/2024 17:19, Sabrina Dubroca wrote:
2024-12-11, 22:15:10 +0100, Antonio Quartulli wrote:
+static struct ovpn_socket *ovpn_socket_get(struct socket *sock)
+{
+	struct ovpn_socket *ovpn_sock;
+	rcu_read_lock();
+	ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
+	if (WARN_ON(!ovpn_socket_hold(ovpn_sock)))
Could we hit this situation when we're removing the last peer (so detaching its socket) just as we're adding a new one? ovpn_socket_new finds the socket already attached and goes through the EALREADY path, but the refcount has already dropped to 0?
hm good point.
Then we'd also return NULL from ovpn_socket_new [1], which I don't think is handled well by the caller (at least the netdev_dbg call at the end of ovpn_nl_peer_modify, maybe other spots too).
(I guess it's not an issue you would see with the existing userspace if it's single-threaded)
The TCP patch 11/22 will convert the socket release routine to a scheduled worker.
Oh right, I forgot about that.
This means we can have the following flow:
1) userspace deletes a peer -> peer drops its reference to the ovpn_socket
2) ovpn_socket refcnt may hit 0 -> cleanup/detach work is scheduled, but not
   yet executed
3) userspace adds a new peer -> attach returns -EALREADY but refcnt is 0
So not so impossible, even with a single-threaded userspace software.
True, that seems possible.
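
The flow above can be modelled in plain userspace C. This is only a sketch, not ovpn code: struct model_sock and the model_* helpers are hypothetical names, and it assumes ovpn_socket_hold() has "get unless zero" semantics like kref_get_unless_zero(). The last put only marks the detach as pending, so the socket still looks attached while its refcount is already 0.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct ovpn_socket. */
struct model_sock {
	atomic_int refcnt;	/* models the kref */
	bool detach_pending;	/* models "work scheduled, not yet executed" */
	bool attached;		/* models sk_user_data still being set */
};

/* Take a reference only while the count is non-zero (assumed semantics
 * of ovpn_socket_hold(), i.e. what kref_get_unless_zero() provides).
 */
static bool model_hold(struct model_sock *s)
{
	int old = atomic_load(&s->refcnt);

	while (old != 0) {
		/* on failure 'old' is refreshed with the current value */
		if (atomic_compare_exchange_weak(&s->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put only *schedules* the detach. */
static void model_put(struct model_sock *s)
{
	if (atomic_fetch_sub(&s->refcnt, 1) == 1)
		s->detach_pending = true;
}

/* Models the -EALREADY path: the socket still looks attached, so the
 * caller tries to grab a reference and may get NULL back.
 */
static struct model_sock *model_get(struct model_sock *s)
{
	if (!s->attached)
		return NULL;
	return model_hold(s) ? s : NULL;
}
```

With refcnt initialised to 1, calling model_put() and then model_get() yields NULL even though attached is still true, which is the window described in steps 2) and 3).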
[...]
+struct ovpn_socket *ovpn_socket_new(struct socket *sock, struct ovpn_peer *peer)
+{
+	struct ovpn_socket *ovpn_sock;
+	int ret;
+	ret = ovpn_socket_attach(sock, peer);
+	if (ret < 0 && ret != -EALREADY)
+		return ERR_PTR(ret);
+	/* if this socket is already owned by this interface, just increase the
+	 * refcounter and use it as expected.
+	 *
+	 * Since UDP sockets can be used to talk to multiple remote endpoints,
+	 * openvpn normally instantiates only one socket and shares it among all
+	 * its peers. For this reason, when we find out that a socket is already
+	 * used for some other peer in *this* instance, we can happily increase
+	 * its refcounter and use it normally.
+	 */
+	if (ret == -EALREADY) {
+		/* caller is expected to increase the sock refcounter before
+		 * passing it to this function. For this reason we drop it if
+		 * not needed, like when this socket is already owned.
+		 */
+		ovpn_sock = ovpn_socket_get(sock);
+		sockfd_put(sock);
[1] so we would need to add

	if (!ovpn_sock)
		return ERR_PTR(-EAGAIN);
I am not sure returning -EAGAIN is the right move at this point. We don't know when the scheduled worker will execute, so we don't know when to try again.
Right.
Maybe we should call cancel_work_sync(&ovpn_sock->work) inside ovpn_socket_get()? Then the latter would return NULL only once it is certain that the socket has been detached.
At that point we can skip the following return and continue along the "new socket" path.
What do you think?
The work may not have been scheduled yet? (small window between the last kref_put and schedule_work)
Maybe a completion [Documentation/scheduler/completion.rst] would solve it (but it makes things even more complex, unfortunately):
- at the end of ovpn_socket_detach: complete(&ovpn_sock->detached);
- in ovpn_socket_new when handling EALREADY: wait_for_completion(&ovpn_sock->detached);
- in ovpn_socket_new for the new socket: init_completion(&ovpn_sock->detached);
but ovpn_sock could be gone immediately after complete(). Maybe something with completion_done() before the kfree_rcu in ovpn_socket_detach? I'm not that familiar with the completion API.
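
For readers unfamiliar with completions, here is a rough userspace analogue built on pthreads. It is a sketch under assumptions, not kernel code: all model_* names are hypothetical, and the "done" flag is sticky, which is closer to complete_all() than to complete(). It does not address the lifetime concern raised above (the object going away right after the completion is signalled).

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogue of the kernel's struct completion. */
struct model_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical stand-in for struct ovpn_socket. */
struct model_sock {
	struct model_completion detached;
	bool attached;
};

/* Counterpart of init_completion() for a newly attached socket. */
static void model_init_completion(struct model_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Signal that the detach has finished; wakes every waiter. */
static void model_complete(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Block until the detach has finished; returns at once if it already has. */
static void model_wait_for_completion(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Models the deferred detach worker: detach first, then signal. */
static void *model_detach_worker(void *arg)
{
	struct model_sock *s = arg;

	s->attached = false;
	model_complete(&s->detached);
	return NULL;
}
```

In this model the -EALREADY path would call model_wait_for_completion(&s->detached) and then continue along the "new socket" path; model_detach_worker() can be run from another thread via pthread_create().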
However, this makes me wonder: what happens if we have two racing PEER_NEW with the same not-yet-attached UDP socket?
mhmm, I remember noticing that, but it seems I never mentioned it in my reviews. Sorry.
Maybe we should lock the socket in ovpn_udp_socket_attach() when checking its user-data and setting it (in order to make the test-and-set atomic)?
I'd use the lock to protect all of ovpn_socket_new. ovpn_tcp_socket_attach locks the socket but after doing the initial checks, so 2 callers could both see sock->sk->sk_user_data == NULL and do the full attach. And I don't think unlocking before rcu_assign_sk_user_data is safe for either UDP or TCP.
I am specifically talking about this in udp.c:
	/* make sure no pre-existing encapsulation handler exists */
	rcu_read_lock();
	old_data = rcu_dereference_sk_user_data(sock->sk);
	if (!old_data) {
		/* socket is currently unused - we can take it */
		rcu_read_unlock();
		setup_udp_tunnel_sock(sock_net(sock->sk), sock, &cfg);
		return 0;
	}
We will end up returning 0 in both contexts and thus allocate two ovpn_sockets instead of re-using the first one we allocated.
Does it make sense?
Yes.
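
The idea of making the check and the claim one atomic step can be sketched in userspace as follows. This is an illustration under assumptions, not the ovpn implementation: struct model_sk, model_attach() and the opaque "owner" token (standing for the ovpn instance) are hypothetical, a pthread mutex stands in for the socket lock, and the -EBUSY case for a foreign owner is assumed.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock: one lock, one user-data slot. */
struct model_sk {
	pthread_mutex_t lock;	/* models the socket lock */
	void *user_data;	/* models sk_user_data */
};

/* Test and set under a single lock hold, so two racing callers can never
 * both observe an empty slot and both perform the full attach.
 */
static int model_attach(struct model_sk *sk, void *owner)
{
	int ret = 0;

	pthread_mutex_lock(&sk->lock);
	if (sk->user_data == owner)
		ret = -EALREADY;	/* already ours: caller reuses it */
	else if (sk->user_data)
		ret = -EBUSY;		/* assumed: owned by someone else */
	else
		sk->user_data = owner;	/* free: claim it before unlocking */
	pthread_mutex_unlock(&sk->lock);

	return ret;
}
```

Two back-to-back calls with the same owner return 0 and then -EALREADY, so only the first caller would go on to allocate an ovpn_socket.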
[...]
[I have some more nits/typos here and there but I worry the maintainers will get "slightly" annoyed if I make you repost 22 patches once again :) -- if that's all I find in the next few days, everyone might be happier if I stash them and we get them fixed after merging?]
If we have to rework this socket attaching part, it may be worth throwing in those typ0 fixes too :)
ACK, I'll send them out.
Thanks a lot.
Thanks again for your patience.