Hi,
It is said that eBPF is a safe way to extend the kernel, which is very attractive, but we need kfuncs to add new uses of eBPF, and kfuncs are said to be as unstable as EXPORT_SYMBOL_GPL. So I'd like to ask some questions:
1) Which should I choose, BPF kfuncs or ioctl, when adding a new feature for userspace apps?
2) How should I use BPF kfuncs from userspace apps if I add them?
Here, a "userspace app" means something that is not a system-wide daemon like systemd (in particular, I have QEMU in mind). I'll describe the context in more detail below:
---
I'm working on a new feature that aids virtio-net implementations that use the tuntap virtual network device. You can see [1] for details, but basically it extends BPF_PROG_TYPE_SOCKET_FILTER to report four more bytes.
However, after long discussions we have confirmed that extending BPF_PROG_TYPE_SOCKET_FILTER is not going to happen, and that adding kfuncs is the way forward. So I looked into how to add kfuncs to the kernel and how to use them. There is plenty of documentation for the kernel side, but I found little about the userspace side. The best I could find is a systemd change proposal that is based on WIP kernel changes[2].
So now I'm wondering how I should use BPF kfuncs from userspace apps if I add them. In the systemd discussion, it is mentioned that Linus said it's fine to use BPF kfuncs in private infrastructure owned by big companies, or in systemd, as those users know the system well[3]. Indeed, those users should be able to make more assumptions about the kernel than "normal" userspace applications can.
Returning to my proposal, I'm proposing a new feature to be used by QEMU or other VMM applications. QEMU is more like a normal userspace application, and usually does not make many assumptions about the kernel it runs on. For example, it's generally safe to run a Debian container that includes QEMU installed with apt on Fedora. BPF kfuncs may work even in such a situation thanks to CO-RE, but that sounds like *accidentally* creating UAPIs.
Considering all of the above, how can I integrate BPF kfuncs into the application?
If BPF kfuncs are like EXPORT_SYMBOL_GPL, the natural way to handle them is to think of BPF programs as some sort of kernel modules and to incorporate logic that behaves like modprobe. More concretely, I can put eBPF binaries in a directory like: /usr/local/share/qemu/ebpf/$KERNEL_RELEASE
Then, QEMU can call uname() and derive the path to the binary. It will report an error if it can't find the binary for the current kernel, so that it won't create accidental UAPIs.
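To make the idea concrete, here is a minimal sketch of such a lookup. It is only an illustration under my own assumptions: the function names qemu_ebpf_path() and qemu_ebpf_find() and the QEMU_EBPF_BASE macro are hypothetical and do not exist in QEMU; only uname(), snprintf() and access() are real interfaces.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Hypothetical base directory, taken from the example path above. */
#define QEMU_EBPF_BASE "/usr/local/share/qemu/ebpf"

/*
 * Build "<base>/<kernel release>" into buf.
 * Returns 0 on success, -1 if uname() fails or buf is too small.
 */
static int qemu_ebpf_path(const char *base, char *buf, size_t len)
{
    struct utsname u;
    int n;

    if (uname(&u) != 0)
        return -1;

    n = snprintf(buf, len, "%s/%s", base, u.release);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Succeed only if a readable entry for the running kernel exists.
 * There is deliberately no fallback to another kernel's binary, so a
 * missing entry is reported as an error instead of silently relying on
 * kfuncs that happen to be present.
 */
static int qemu_ebpf_find(const char *base, char *buf, size_t len)
{
    if (qemu_ebpf_path(base, buf, len) != 0)
        return -1;

    if (access(buf, R_OK) != 0) {
        fprintf(stderr, "no eBPF binary for this kernel: %s\n", buf);
        return -1;
    }
    return 0;
}
```

A caller would pass QEMU_EBPF_BASE as the base and refuse to enable the feature when qemu_ebpf_find() returns -1.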
The obvious downside of this is that it complicates packaging a lot; it requires packaging QEMU eBPF binaries each time a new kernel comes out. For kernel modules this complexity is managed centrally by modprobe, but for BPF programs each application apparently needs to take care of it itself.
In conclusion, I see too much complexity in using BPF in a userspace application, which we didn't have to care about with BPF_PROG_TYPE_SOCKET_FILTER. Isn't there a better way? Or should I not use BPF in my case in the first place?
Thanks,
Akihiko Odaki
[1] https://lore.kernel.org/all/20231015141644.260646-1-akihiko.odaki@daynix.com...
[2] https://github.com/systemd/systemd/pull/29797
[3] https://github.com/systemd/systemd/pull/29797#discussion_r1384637939