Gur Stavi wrote:
Gur Stavi wrote:
Gur Stavi wrote:
Gur Stavi wrote: > >> @@ -1846,21 +1846,21 @@ static int fanout_add(struct sock
*sk,
struct fanout_args *args) > >> err = -EINVAL; > >> > >> spin_lock(&po->bind_lock); > >> - if (packet_sock_flag(po, PACKET_SOCK_RUNNING) && > >> - match->type == type && > >> + if (match->type == type && > >> match->prot_hook.type == po->prot_hook.type && > >> match->prot_hook.dev == po->prot_hook.dev) { > > > > Remaining unaddressed issue is that the socket can now be
added
> > before being bound. See comment in v1. > > I extended the psock_fanout test with unbound fanout test. > > As far as I understand, the easiest way to verify bind is to
test
that
> po->prot_hook.dev != NULL, since we are under a bind_lock
anyway.
> But perhaps a more readable and direct approach to test "bind"
would be
> to test po->ifindex != -1, as ifindex is commented as "bound
device".
> However, at the moment ifindex is not initialized to -1, I can
add
such
> initialization, but perhaps I do not fully understand all the
logic.
> > Any preferences?
prot_hook.dev is not necessarily set if a packet socket is bound. It may be bound to any device. See dev_add_pack and ptype_head.
prot_hook.type, on the other hand, must be set if bound and is
only
modified with the bind_lock held too.
Well, and in packet_create. But setsockopt PACKET_FANOUT_ADD also succeeds in case bind() was not called explicitly first to bind
to
a specific device or change ptype.
Please clarify the last paragraph? When you say "also succeeds" do
you
mean SHOULD succeed or MAY SUCCEED by mistake if "something"
happens
???
I mean it succeeds currently. Which behavior must then be maintained.
Do you refer to the following scenario: socket is created with non-
zero
protocol and becomes RUNNING "without bind" for all devices. In
that
case
it can be added to FANOUT without bind. Is that considered a bug or
does
the bind requirement for fanout only apply for all-protocol (0)
sockets?
I'm beginning to think that this bind requirement is not needed.
I agree with that. I think that is an historical mistake that socket becomes implicitly bound to all interfaces if a protocol is defined during create. Without this bind requirement would make sense.
All type and dev are valid, even if an ETH_P_NONE fanout group would be fairly useless.
Fanout is all about RX, I think that refusing fanout for socket that will not receive any packet is OK. The condition can be: if (po->ifindex == -1 || !po->num)
Fanout is not limited to sockets bound to a specific interface. This will break existing users.
For specific interface ifindex >= 1 For "any interface" ifindex == 0 ifindex is -1 only if the socket was created unbound with proto == 0 or for the rare race case that during re-bind the new dev became unlisted. For both of these cases fanout should fail.
The only case where packet_create does not call __register_prot_hook is if proto == 0. If proto is anything else, the socket will be bound, whether to a device hook, or ptype_all. I don't think we need this extra ifindex condition.
Binding to ETH_P_NONE is useless, but we're not going to slow down legitimate users with branches for cases that are harmless.
With "branch", do you refer to performance or something else? As I said in other mail, ETH_P_NONE could not be used in a fanout before as well because socket cannot become RUNNING with proto == 0.
Good point.
For performance, we removed the RUNNING condition and added this. It is not like we need to perform 5M fanout registrations/sec. It is a syscall after all.
It's as much about code complexity as performance. Both the patch and resulting code should be as small and self-evident as possible.
Patch v3 introduces a lot of code churn.
If we don't care about opening up fanout groups to ETH_P_NONE, then patch v2 seems sufficient. If explicitly blocking this, the ENXIO return can be added, but ideally without touching the other lines.
I realized another possible problem. We should consider adding ifindex Field to struct packet_fanout to be used for lookup of an existing
match.
There is little sense to bind sockets to different interfaces and then put them in the same fanout group. If you agree, I can prepare a separate patch for that.
The type and dev must match that of the fanout group, and once added to a fanout group can no longer be changed (bind will fail).
I briefy considered the reason might be max_num_members accounting. Since f->num_members counts running sockets. But that is not used when tracking membership of the group, sk_ref is. Every packet socket whose po->rollover is increased increases this refcount.
What about using ifindex to detect bind? Initialize it to -1 in packet_create and ensure that packet_do_bind, on success, sets it to device id or 0?
psock_fanout, should probably be extended with scenarios that test "all devices" and all/specific protocols. Any specific scenario suggestions?