On Tue, Jul 18, 2023 at 11:11 AM Jakub Kicinski <kuba@kernel.org> wrote:
On Tue, 18 Jul 2023 10:50:14 -0700 Alexei Starovoitov wrote:
On Tue, Jul 18, 2023 at 10:18 AM Jakub Kicinski <kuba@kernel.org> wrote:
you're still missing the point. Pls read the whole patch series.
Could you just tell me what the point is then? The "series" is one patch plus some tiny selftests. I don't see any documentation for how dynptrs are supposed to work either.
As far as I can grasp this makes the "copy buffer" optional from the kfunc-API perspective (of bpf_dynptr_slice()).
It is _not_ input validation. skb_copy_bits is a slow path. One extra check doesn't affect performance at all. So 'fast paths' isn't a valid argument here. The code is reusing
    if (likely(hlen - offset >= len))
        return (void *)data + offset;
which _is_ the fast path.
What you're requesting is to copy paste the whole __skb_header_pointer into __skb_header_pointer2. Makes no sense.
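For readers following the argument, the shape of the helper under discussion can be sketched roughly as follows. This is a simplified userspace model, not the kernel source: `struct pkt`, `pkt_copy_bits()` and `pkt_header_pointer()` are hypothetical stand-ins for `struct sk_buff`, `skb_copy_bits()` and `__skb_header_pointer()`. The thread mentions a `!buffer` check; its exact placement here is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a linear head plus one extra fragment. */
struct pkt {
	const unsigned char *head;  /* linear part */
	int headlen;                /* bytes in the linear part */
	const unsigned char *frag;  /* non-linear part */
	int fraglen;                /* bytes in the non-linear part */
};

/* Slow path: copy len bytes starting at offset, crossing from head into
 * frag where needed. Returns 0 on success, -1 if the range is outside
 * the packet. */
static int pkt_copy_bits(const struct pkt *p, int offset, void *to, int len)
{
	unsigned char *dst = to;
	int i;

	if (offset < 0 || len < 0 || offset > p->headlen + p->fraglen - len)
		return -1;
	for (i = 0; i < len; i++) {
		int pos = offset + i;

		dst[i] = pos < p->headlen ? p->head[pos]
					  : p->frag[pos - p->headlen];
	}
	return 0;
}

/* Callers are assumed to pass a non-negative offset and len. */
static void *pkt_header_pointer(const struct pkt *p, int offset, int len,
				const void *data, int hlen, void *buffer)
{
	/* Fast path: the requested range is already in the linear area. */
	if (hlen - offset >= len)
		return (unsigned char *)data + offset;

	/* Slow path: without a packet or a buffer there is nothing to copy
	 * from or nowhere to copy to, so give up instead of copying. */
	if (!p || !buffer || pkt_copy_bits(p, offset, buffer, len) < 0)
		return NULL;

	return buffer;
}
```

In this model a NULL `buffer` (or a NULL `p`) means "linear pointer or nothing": the fast path is unchanged, and the copy is skipped instead of attempted.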
No, Alexei, the whole point of skb_header_pointer() is to pass the secondary buffer, to make header parsing dependable.
of course. No one argues about that.
Passing NULL buffer to skb_header_pointer() is absolutely nonsensical.
Quick grep through the code proves you wrong: drivers/net/ethernet/broadcom/bnxt/bnxt.c
    __skb_header_pointer(NULL, start, sizeof(*hp), skb->data, skb_headlen(skb), NULL);
was done before this patch. It's using the __ variant on purpose and explicitly passing skb==NULL to trigger exactly that line, deliberately avoiding the slow path.
Another example: drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
    skb_header_pointer(skb, 0, 0, NULL);
This one I'm not sure about. Looks buggy.
These are both in the Tx path for setting up offloads; Linux doesn't request offloads for headers outside of the linear part. The ixgbevf code is completely pointless, as you say.
In general drivers are rarely a source of high quality code examples. Having been directly involved in the bugs that led to the bnxt code being written - I was so happy that the driver started parsing Tx packets *at all*, so I wasn't too fussed by the minor problems :(
It should *not* be supported. We had enough prod problems with people thinking that the entire header will be in the linear portion. Then either the NIC can't parse the header, someone enables jumbo, disables GRO, adds new HW, adds encap, etc etc and things implode.
I don't see how this is related. A NULL buffer allows getting a linear pointer and explicitly avoids the slow path when the data is not linear.
Direct packet access via skb->data is there for those who want high speed 🤷️
skb->data/data_end approach unfortunately doesn't work that well. Too much verifier fighting. That's why dynptr was introduced.
If you want to support it in BPF that's up to you, but I think it's entirely reasonable for me to request that you don't do such things in general networking code. The function is 5 LoC, so a local BPF copy seems fine. Although I'd suggest skb_header_pointer_misguided() rather than __skb_header_pointer2() as the name :)
If you insist we can, but bnxt is an example that buffer==NULL is a useful concept for networking and not bpf specific. It also doesn't make "people think the header is linear" any worse.
My worry is that people will think that whether the buffer is needed or not depends on _their program_, rather than on the underlying platform. So if it works in testing without the buffer - the buffer must not be required for their use case.
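The concern above can be illustrated with a small userspace sketch. Everything here is hypothetical and only models the behaviour being debated: `struct pkt`, `hdr_ptr()` and `read_field()` are made-up names, and the `memcpy()` stands in for the kernel's slow-path copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical packet model: `bytes` holds the whole packet, but only the
 * first `headlen` bytes count as the linear area a caller may point into. */
struct pkt {
	const unsigned char *bytes;
	int len;
	int headlen;
};

/* Simplified header-pointer helper with an optional copy buffer. */
static const void *hdr_ptr(const struct pkt *p, int offset, int len,
			   void *buffer)
{
	if (offset < 0 || len < 0 || offset > p->len - len)
		return NULL;                    /* outside the packet */
	if (p->headlen - offset >= len)
		return p->bytes + offset;       /* fast path: already linear */
	if (!buffer)
		return NULL;                    /* not linear, nowhere to copy */
	memcpy(buffer, p->bytes + offset, len); /* stand-in for the slow path */
	return buffer;
}

/* Read a 16-bit big-endian field at offset 2; returns -1 on failure. */
static int read_field(const struct pkt *p, int use_buffer)
{
	unsigned char scratch[2];
	const unsigned char *f = hdr_ptr(p, 2, 2, use_buffer ? scratch : NULL);

	return f ? (f[0] << 8) | f[1] : -1;
}
```

With identical packet bytes, the buffer-less read succeeds when `headlen` covers the field and fails when it does not, while the buffered read succeeds in both layouts. Whether the buffer is needed is decided by how the platform laid out the packet, not by the program doing the parsing.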
Are you concerned about bpf progs breaking this way? I thought you're worried about the driver misusing skb_header_pointer() with buffer==NULL.
We can remove the !buffer check as in the attached patch, but I don't quite see how it would improve driver quality.