On 11/5/21 07:04, Leonard Crestez wrote:
On 11/5/21 3:22 AM, Dmitry Safonov wrote:
[..]
I remember we discussed it in RFC, that removing a key that's currently in use may result in random MKT to be used.
I think, it's possible to make this API a bit more predictable if:
- DEL command fails to remove a key that is current/receive_next;
- opt.flags has CURR/NEXT flag that has corresponding `u8 current_key`
and `u8 receive_next` values. As socket lock is held - that makes current_key/receive_next change atomic with deletion of an existing key that might have been in use.
In result user may remove a key that's not in use or has to set new current/next. Which avoids the issue with random MKT being used to sign segments.
The MKT used to sign segments is already essentially random unless the user makes a deliberate choice. This is what happens if you add two keys an call connect(). But why is this a problem?
The issue is predictability and less control for a user on how the key is selected.
Let's say as a user I have two MKTs A and B. I want to use A for 6 weeks and then change to B. I want to switch to B as soon as the admin of the peer adds the key and the peer sends me (rnext_key = B.id).
With your semantics currently a random key will be used as long as I don't "lock" the id which means that rnext_key won't be respected. So there's clearly less predictability for a user to select current key in use.
Applications which want to deliberately control the send key can do so with TCP_AUTHOPT_FLAG_LOCK_KEYID. If that flag is not set then the key with send_id == recv_rnextkeyid is preffered as suggested by the RFC, or a random one on connect.
I think your suggestion would force additional complexity on all applications for no clear gain.
I disagree. From RFC (3.1):
"It is presumed that an MKT affecting a particular connection cannot be destroyed during an active connection -- or, equivalently, that its parameters are copied to an area local to the connection (i.e., instantiated) and so changes would affect only new connections."
which means that the user shouldn't be able to remove a key in use. So, by default you should return an error if the key in use being deleted.
The only use-case to delete a key that is in use is if it has been compromised RFC(6.1):
"Deciding when to start using a key is a performance issue. Deciding when to remove an MKT is a security issue. Invalid MKTs are expected to be removed. TCP-AO provides no mechanism to coordinate their removal, as we consider this a key management operation."
I might misread the RFC, but it seems that shouldn't happen in an ordinary usage scenario (as long as the user don't --force removal of the compromised key in an exceptional case).
So, if you allow a user to set current_key/rnext_key atomically with removal - it seems to fit this --force use-case and let user more control over which key is in use.
Key selection controls are only added much later in the series, this is also part of the effort to split the code into readable patches. See this patch:
https://lore.kernel.org/netdev/2dc569c0d60c80c26aafcaa201ba5b5ec53ce6bd.1635...
A separate issue with that one (if I'm not misreading) seems to be that you're going to send segments with info->send_rnextkeyid if the deleted key was TCP_AUTHOPT_FLAG_LOCK_RNEXTKEYID one. And won't be able to verify the peer inbound segments/replies.
Removing a key while traffic is happening shouldn't cause failures in recv or send code; this takes some effort but is also required to prevent auth failures when a socket is closed and transitions to timewait. I attempted to ensure this by only doing rcu_dereference for tcp_authopt_info and tcp_authopt_key_info once per packet.
Thanks, Dmitry