On 12/2/22 12:49 PM, Eyal Birger wrote:
On Fri, Dec 2, 2022 at 10:27 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
On 12/2/22 11:42 AM, Eyal Birger wrote:
Hi Martin,
On Fri, Dec 2, 2022 at 9:08 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
On 12/2/22 1:59 AM, Eyal Birger wrote:
+__used noinline
+int bpf_skb_set_xfrm_info(struct __sk_buff *skb_ctx,
+			   const struct bpf_xfrm_info *from)
+{
+	struct sk_buff *skb = (struct sk_buff *)skb_ctx;
+	struct metadata_dst *md_dst;
+	struct xfrm_md_info *info;
+	if (unlikely(skb_metadata_dst(skb)))
+		return -EINVAL;
+	md_dst = this_cpu_ptr(xfrm_md_dst);
+	info = &md_dst->u.xfrm_info;
+	info->if_id = from->if_id;
+	info->link = from->link;
+	skb_dst_force(skb);
+	info->dst_orig = skb_dst(skb);
+	dst_hold((struct dst_entry *)md_dst);
+	skb_dst_set(skb, (struct dst_entry *)md_dst);
I may have missed something obvious, and this just came to mind:
What stops cleanup_xfrm_interface_bpf() being run while skb is still holding the md_dst?
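The concern can be illustrated with a small userspace model. This is only a sketch, not kernel code: `model_dst`, `model_skb`, and the `model_*` functions are hypothetical stand-ins, and it assumes that the module cleanup releases the per-CPU dst memory without waiting for references still held by in-flight packets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for dst_entry and sk_buff; not the kernel types. */
struct model_dst {
	int refcnt;  /* plays the role of the dst reference count */
	bool freed;  /* set once the owner has released the memory */
};

struct model_skb {
	struct model_dst *dst;  /* reference taken on behalf of the packet */
};

/* Models dst_hold() followed by skb_dst_set(): the packet pins the dst. */
static void model_set_dst(struct model_skb *skb, struct model_dst *dst)
{
	dst->refcnt++;
	skb->dst = dst;
}

/* Models the packet being consumed and dropping its reference. */
static void model_free_skb(struct model_skb *skb)
{
	if (skb->dst) {
		assert(skb->dst->refcnt > 0);
		skb->dst->refcnt--;
		skb->dst = NULL;
	}
}

/* Models a module exit path that frees the dst unconditionally (assumed). */
static void model_module_cleanup(struct model_dst *dst)
{
	dst->freed = true;
}

/* True when a packet still points at memory its owner already released. */
static bool model_skb_dangles(const struct model_skb *skb)
{
	return skb->dst != NULL && skb->dst->freed;
}
```

In this model, calling `model_module_cleanup()` between `model_set_dst()` and `model_free_skb()` leaves the packet with a dangling pointer, which is the window the question above is about.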
Oh I think you're right. I missed this.
To keep this implementation, I suppose the module would have to be prevented from being removed once this kfunc is in use, but that could be seen as annoying from a configuration/user-experience standpoint.
Alternatively the metadata dsts can be separately allocated from the kfunc, which is probably the simplest approach to maintain, so I'll work on that approach.
If it means dst_alloc on every skb, it will not be cheap.
Another option is to call metadata_dst_alloc_percpu() once, during the very first bpf_skb_set_xfrm_info() call, and never free the xfrm_md_dst memory. It is a tradeoff, but likely the correct one. You can take a look at bpf_get_skb_set_tunnel_proto().
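The allocate-once idea can be sketched in userspace C as follows. This is a minimal model under stated assumptions, not the kernel implementation: `model_md_dst`, `model_md_dsts`, `model_get_md_dsts()`, and `MODEL_NR_SLOTS` are hypothetical names, `calloc()` stands in for the per-CPU allocator, and a C11 compare-and-exchange stands in for whatever publication primitive the kernel code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MODEL_NR_SLOTS 4  /* hypothetical stand-in for the number of CPUs */

/* Hypothetical stand-in for the per-CPU metadata dst. */
struct model_md_dst {
	unsigned int if_id;
	int link;
};

/* Lives for the whole process lifetime and is intentionally never freed. */
static _Atomic(struct model_md_dst *) model_md_dsts;

/* Return the shared array, allocating it on the first call only. */
static struct model_md_dst *model_get_md_dsts(void)
{
	struct model_md_dst *cur = atomic_load(&model_md_dsts);

	if (!cur) {
		struct model_md_dst *expected = NULL;
		struct model_md_dst *tmp = calloc(MODEL_NR_SLOTS, sizeof(*tmp));

		if (!tmp)
			return NULL;

		if (atomic_compare_exchange_strong(&model_md_dsts, &expected, tmp)) {
			/* This caller won the race and published its array. */
			cur = tmp;
		} else {
			/* Another caller published first; discard the duplicate. */
			free(tmp);
			cur = expected;
		}
		assert(cur != NULL);
	}
	return cur;
}
```

Every later call returns the same array, so a reference held by an in-flight packet can never outlive the memory. The cost is that the allocation stays around even when nothing uses it any more.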
Yes, I originally wrote this as a helper similar to the tunnel key helper which uses bpf_get_skb_set_tunnel_proto(), and when converting to kfuncs I kept the percpu implementation.
However, the set tunnel key code is never unloaded, whereas taking this approach here would mean that this memory would leak on each module reload, iiuc.
'struct metadata_dst __percpu *xfrm_md_dst' cannot live in the xfrm module. Putting it in filter.c could be an option.