On Mon 29 Mar 2021 at 15:32, Toke Høiland-Jørgensen toke@redhat.com wrote:
Vlad Buslov vladbu@nvidia.com writes:
On Thu 25 Mar 2021 at 14:00, Kumar Kartikeya Dwivedi memxor@gmail.com wrote:
This adds functions that wrap the netlink API used for adding, manipulating, and removing filters and actions. These functions operate directly on the loaded prog's fd, and return a handle to the filter and action using an out parameter (id for tc_cls, and index for tc_act).
The basic featureset is covered to allow for attaching, manipulation of properties, and removal of filters and actions. Some additional features like TCA_BPF_POLICE and TCA_RATE for tc_cls have been omitted. These can be added on top later by extending the bpf_tc_cls_opts struct.
Support for binding actions directly to a classifier by passing them in during filter creation has also been omitted for now. These actions have an auto clean up property because their lifetime is bound to the filter they are attached to. This can be added later, but was omitted for now as direct action mode is a better alternative to it.
An API summary:
The BPF TC-CLS API
bpf_tc_cls_{attach, change, replace}_{dev, block} may be used to attach, change, and replace SCHED_CLS bpf classifiers. Separate sets of functions are provided for network interfaces and shared filter blocks.
bpf_tc_cls_detach_{dev, block} may be used to detach an existing SCHED_CLS filter. The bpf_tc_cls_attach_id object filled in during attach, change, or replace must be passed in to the detach functions for them to remove the filter and its attached classifier correctly.
bpf_tc_cls_get_info is a helper that can be used to obtain attributes for the filter and classifier. The opts structure may be used to choose the granularity of search, such that info for a specific filter corresponding to the same loaded bpf program can be obtained. By default, the first match is returned to the user.
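To illustrate what "granularity of search" could mean in practice, here is a minimal sketch of first-match selection over a set of filters. It is not the code from the patch: struct filter_key, key_matches() and find_first_match() are hypothetical names, and treating a zero field as "not specified" is an assumption made for the sketch (chain 0, for instance, is a real chain in the kernel).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields that identify an attached filter. */
struct filter_key {
	uint32_t handle;
	uint32_t priority;
	uint32_t chain_index;
};

/* Assumption: a zero field in `want` means "not specified" and matches anything. */
static bool key_matches(const struct filter_key *want,
			const struct filter_key *have)
{
	if (want->handle && want->handle != have->handle)
		return false;
	if (want->priority && want->priority != have->priority)
		return false;
	if (want->chain_index && want->chain_index != have->chain_index)
		return false;
	return true;
}

/* Return the index of the first filter matching `want`, or -1 if none does. */
static int find_first_match(const struct filter_key *filters, size_t n,
			    const struct filter_key *want)
{
	for (size_t i = 0; i < n; i++)
		if (key_matches(want, &filters[i]))
			return (int)i;
	return -1;
}
```

With an all-zero key the first filter in the list is returned; filling in more fields narrows the search to a specific filter.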
Examples:
    struct bpf_tc_cls_attach_id id = {};
    struct bpf_object *obj;
    struct bpf_program *p;
    int fd, r;

    obj = bpf_object__open("foo.o");
    if (IS_ERR_OR_NULL(obj))
        return PTR_ERR(obj);

    p = bpf_object__find_program_by_title(obj, "classifier");
    if (IS_ERR_OR_NULL(p))
        return PTR_ERR(p);

    if (bpf_object__load(obj) < 0)
        return -1;

    fd = bpf_program__fd(p);

    r = bpf_tc_cls_attach_dev(fd, if_nametoindex("lo"),
                              BPF_TC_CLSACT_INGRESS, ETH_P_IP,
                              NULL, &id);
    if (r < 0)
        return r;
... which is roughly equivalent to (after clsact qdisc setup):
    # tc filter add dev lo ingress bpf obj /home/kkd/foo.o sec classifier
If a user wishes to modify existing options on an attached filter, the bpf_tc_cls_change_{dev, block} API may be used. Parameters like chain_index, priority, and handle are ignored in the bpf_tc_cls_opts struct as they cannot be modified after attaching a filter.
Example:
    /* Optional parameters necessary to select the right filter */
    DECLARE_LIBBPF_OPTS(bpf_tc_cls_opts, opts,
                        .handle = id.handle,
                        .priority = id.priority,
                        .chain_index = id.chain_index);

    /* Turn on direct action mode */
    opts.direct_action = true;
    r = bpf_tc_cls_change_dev(fd, id.ifindex, id.parent_id,
                              id.protocol, &opts, &id);
    if (r < 0)
        return r;

    /* Verify that the direct action mode has been set */
    struct bpf_tc_cls_info info = {};
    r = bpf_tc_cls_get_info_dev(fd, id.ifindex, id.parent_id,
                                id.protocol, &opts, &info);
    if (r < 0)
        return r;

    assert(info.bpf_flags & TCA_BPF_FLAG_ACT_DIRECT);
This would be roughly equivalent to doing:
    # tc filter change dev lo ingress prio <p> handle <h> bpf obj /home/kkd/foo.o section classifier da
... except that a new bpf program will be loaded and will replace the existing one.
If a user wishes to either replace an existing filter, or create a new one with the same properties, they can use bpf_tc_cls_replace_dev. The benefit of bpf_tc_cls_change is that it fails if no matching filter exists.
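The difference between attach, change, and replace can be sketched in terms of netlink message flags. This assumes the wrappers follow the same convention that tc filter add/change/replace uses for RTM_NEWTFILTER requests; the summary above does not spell that out. The enum cls_op and cls_op_nlmsg_flags() are hypothetical names used only for illustration.

```c
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical enum naming the three operations from the summary. */
enum cls_op { CLS_ATTACH, CLS_CHANGE, CLS_REPLACE };

/*
 * Map an operation to the nlmsg_flags of an RTM_NEWTFILTER request,
 * mirroring what tc does for "add", "change" and "replace".
 */
static unsigned int cls_op_nlmsg_flags(enum cls_op op)
{
	unsigned int flags = NLM_F_REQUEST | NLM_F_ACK;

	switch (op) {
	case CLS_ATTACH:
		/* Create a new filter; fail if a matching one already exists. */
		flags |= NLM_F_CREATE | NLM_F_EXCL;
		break;
	case CLS_CHANGE:
		/* No NLM_F_CREATE: fail if no matching filter exists. */
		break;
	case CLS_REPLACE:
		/* Replace a matching filter, or create one if there is none. */
		flags |= NLM_F_CREATE;
		break;
	}
	return flags;
}
```

Under that assumption, change is the only operation that never creates a filter, which is why it can be used to detect a missing one.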
The BPF TC-ACT API
bpf_tc_act_{attach, replace} may be used to attach and replace already attached SCHED_ACT actions. Passing an index of 0 has special meaning, in that an index will be automatically chosen by the kernel. The index chosen by the kernel is the return value of these functions in case of success.
bpf_tc_act_detach may be used to detach a SCHED_ACT action prog identified by the index parameter. The index 0 again has a special meaning, in that passing it will flush all existing SCHED_ACT actions loaded using the ACT API.
bpf_tc_act_get_info is a helper to get the required attributes of a loaded program to be able to manipulate it further, by passing them into the aforementioned functions.
Example:
    struct bpf_object *obj;
    struct bpf_program *p;
    __u32 index;
    int fd, r;

    obj = bpf_object__open("foo.o");
    if (IS_ERR_OR_NULL(obj))
        return PTR_ERR(obj);

    p = bpf_object__find_program_by_title(obj, "action");
    if (IS_ERR_OR_NULL(p))
        return PTR_ERR(p);

    if (bpf_object__load(obj) < 0)
        return -1;

    fd = bpf_program__fd(p);

    r = bpf_tc_act_attach(fd, NULL, &index);
    if (r < 0)
        return r;

    if (bpf_tc_act_detach(index))
        return -1;
... which is equivalent to the following sequence:
    tc action add action bpf obj /home/kkd/foo.o sec action
    tc action del action bpf index <idx>
How do you handle the locking here? Please note that while RTM_{NEW|GET|DEL}FILTER API has been refactored to handle its own locking internally (and registered with RTNL_FLAG_DOIT_UNLOCKED flag), RTM_{NEW|GET|DEL}ACTION API still expects to be called with rtnl lock taken.
Huh, locking? This is all userspace code that uses the netlink API...
-Toke
Thanks for the clarification. I'm not familiar with libbpf internals and it wasn't obvious to me that this functionality is not for creating classifiers/actions from BPF program executing in kernel-space.
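For readers who run into the same confusion, here is a rough sketch of the userspace side: the fixed header of an RTM_NEWTFILTER request for a clsact ingress filter, filled in before it would be sent over an AF_NETLINK socket. This is not the code from the patch; struct tc_filter_req and fill_ingress_filter_req() are hypothetical names. The socket handling and the attributes that carry the program (TCA_KIND set to "bpf", and TCA_OPTIONS with TCA_BPF_FD) are left out. Whatever locking the request needs happens inside the kernel while it processes the message.

```c
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/netlink.h>
#include <linux/pkt_sched.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: netlink header followed by the tc header. */
struct tc_filter_req {
	struct nlmsghdr nh;
	struct tcmsg tc;
};

/* Fill in the fixed part of a "create filter on clsact ingress" request. */
static void fill_ingress_filter_req(struct tc_filter_req *req, int ifindex,
				    unsigned int prio, unsigned short protocol)
{
	memset(req, 0, sizeof(*req));
	req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
	req->nh.nlmsg_type = RTM_NEWTFILTER;
	req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK |
			      NLM_F_CREATE | NLM_F_EXCL;
	req->tc.tcm_family = AF_UNSPEC;
	req->tc.tcm_ifindex = ifindex;
	/* The clsact qdisc exposes ingress and egress as fixed parent handles. */
	req->tc.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS);
	/* Priority goes in the upper 16 bits, protocol (network order) in the lower. */
	req->tc.tcm_info = TC_H_MAKE(prio << 16, htons(protocol));
}
```

A call such as fill_ingress_filter_req(&req, if_nametoindex("lo"), prio, ETH_P_IP) would prepare the header for the loopback example above, with prio chosen by the caller.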