From: Lorenz Bauer lmb@isovalent.com Date: Tue, 20 Jun 2023 15:26:05 +0100
On Tue, Jun 13, 2023 at 7:57 PM Kuniyuki Iwashima kuniyu@amazon.com wrote:
The assignment to result below is buggy. Let's say SO_REUSEPROT group have TCP_CLOSE and TCP_ESTABLISHED sockets.
- Find TCP_CLOSE sk and do SO_REUSEPORT lookup
- result is not NULL, but the group has TCP_ESTABLISHED sk
- result = result
- Find TCP_ESTABLISHED sk, which has a higher score
- result = result (TCP_CLOSE) <-- should be sk.
Same for v6 function.
Thanks for your explanation, I think I get it now. I misunderstood that you were worried about returning TCP_ESTABLISHED instead of TCP_CLOSE, but it's exactly the other way around.
I have a follow up question regarding the existing code:
result = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum); /* Fall back to scoring if group has connections */ if (result && !reuseport_has_conns(sk)) return result; result = result ? : sk; badness = score;
Assuming that result != NULL but reuseport_has_conns() == true, we use the reuseport socket as the result, but assign the score of sk to badness. Shouldn't we use the score of the reuseport socket?
Good point. This is based on an assumption that all SO_REUSEPORT sockets have the same score, which is wrong for two corner cases if reuseport_has_conns() == true :
1) SO_INCOMING_CPU is set -> selected sk might have +1 score
2) BPF prog returns ESTABLISHED and/or SO_INCOMING_CPU sk -> selected sk will have more than 8
Using the old score could trigger more lookups depending on the order that sockets are created.
sk -> sk (SO_INCOMING_CPU) -> sk (ESTABLISHED) | | `-> select the next SO_INCOMING_CPU sk | `-> select itself (We should save this lookup)
So, yes, we should update badness like
if (unlikely(result)) { badness = compute_score(result, ...); } else { result = sk; badness = score; }