On 1/17/24 7:55 AM, Willem de Bruijn wrote:
Martin KaFai Lau wrote:
On 1/16/24 7:17 AM, Willem de Bruijn wrote:
Jörn-Thorben Hinz wrote:
A BPF application, e.g., a TCP congestion control, might benefit from or even require precise (=hardware) packet timestamps. These timestamps are already available through __sk_buff.hwtstamp and bpf_sock_ops.skb_hwtstamp, but could not be requested: BPF programs were not allowed to set SO_TIMESTAMPING* on sockets.
This patch only uses SOF_TIMESTAMPING_RX_HARDWARE in the selftest. What about the others, e.g., SOF_TIMESTAMPING_TX_*, which will affect sk->sk_error_queue? That does not seem good. If the rx tstamp is useful, shouldn't the tx tstamp be useful as well?
Good point. Alternatively, those flags should not be allowed to be set from BPF.
That significantly changes process behavior, e.g., by returning POLLERR.
Enable BPF programs to actively request the generation of timestamps from a stream socket. The ioctl(SIOCSHWTSTAMP) on the network device, which is also required, must still be done separately in user space.
hmm... so both ioctl(SIOCSHWTSTAMP) of the netdevice and the SOF_TIMESTAMPING_RX_HARDWARE of the sk must be done?
I am likely missing something. When the skb is created in the driver rx path, the sk is not known yet. How does SOF_TIMESTAMPING_RX_HARDWARE on the sk affect skb_shinfo(skb)->hwtstamps?
Indeed it does not seem to do anything in the datapath.
Requesting SOF_TIMESTAMPING_RX_SOFTWARE will call net_enable_timestamp to start timestamping packets.
But SOF_TIMESTAMPING_RX_HARDWARE does no such thing.
Drivers do use it in ethtool get_ts_info to signal hardware capabilities. But those must be configured using the ioctl.
It is there more for consistency with the other timestamp recording options, I suppose.
Thanks for the explanation on the SOF_TIMESTAMPING_RX_{HARDWARE,SOFTWARE}.
__sk_buff.hwtstamp should have the NIC rx timestamp then as long as the NIC is ioctl configured.
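To make the "ioctl configured" step concrete, here is a minimal user-space sketch, assuming a Linux system and a NIC driver that supports hardware timestamping. The helper names fill_hwtstamp_request() and enable_rx_hwtstamp() are made up for illustration, and HWTSTAMP_FILTER_ALL is only one possible filter choice; a driver may reject or adjust it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/net_tstamp.h>
#include <linux/sockios.h>

/* Hypothetical helper: prepare an ifreq that asks the NIC to timestamp
 * all received packets. cfg must stay valid until the ioctl returns,
 * because ifr_data only points at it. */
void fill_hwtstamp_request(struct ifreq *ifr, struct hwtstamp_config *cfg,
                           const char *ifname)
{
    memset(ifr, 0, sizeof(*ifr));
    memset(cfg, 0, sizeof(*cfg));
    cfg->tx_type = HWTSTAMP_TX_OFF;        /* rx only in this sketch */
    cfg->rx_filter = HWTSTAMP_FILTER_ALL;  /* assumption: timestamp every packet */
    strncpy(ifr->ifr_name, ifname, IFNAMSIZ - 1);
    ifr->ifr_data = (char *)cfg;
}

/* Hypothetical helper: issue SIOCSHWTSTAMP on the named device.
 * Returns 0 on success, -errno on failure (this typically needs
 * CAP_NET_ADMIN and driver support). */
int enable_rx_hwtstamp(const char *ifname)
{
    struct ifreq ifr;
    struct hwtstamp_config cfg;
    int ret = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -errno;
    fill_hwtstamp_request(&ifr, &cfg, ifname);
    if (ioctl(fd, SIOCSHWTSTAMP, &ifr) < 0)
        ret = -errno;
    close(fd);
    return ret;
}
```

A caller would run something like enable_rx_hwtstamp("eth0") once before loading the BPF program; the interface name is a placeholder.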
Jörn, do you need RX_SOFTWARE? From looking at net_timestamp_set(), any socket that has requested RX_SOFTWARE should be enough to get a skb->tstamp for all skbs. A workaround is to manually create a socket and turn on RX_SOFTWARE.
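A rough sketch of that workaround from user space, assuming Linux and that a plain UDP socket is acceptable as the "dummy" socket. The function name open_rx_software_tstamp_socket() is made up for illustration; the socket has to stay open for as long as the timestamps are wanted.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Hypothetical helper: open a throwaway UDP socket and request software
 * rx timestamps on it. Returns the fd on success or -errno on failure.
 * Closing the fd drops the request again. */
int open_rx_software_tstamp_socket(void)
{
    int flags = SOF_TIMESTAMPING_RX_SOFTWARE;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -errno;
    if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
                   &flags, sizeof(flags)) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}
```

The socket is never used for traffic; it only exists to hold the RX_SOFTWARE request.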
It would still be nice to get proper bpf_setsockopt() support for RX_SOFTWARE, but it should be considered together with how SOF_TIMESTAMPING_TX_* should work in a bpf prog, considering that TX timestamping does not have a workaround like the one for RX_SOFTWARE.
It is probably cleaner to have a separate bit in sk->sk_tsflags for bpf, so that the bpf prog is not affected by userspace turning timestamping on/off, and so that userspace's expectations (e.g. sk_error_queue and POLLERR) do not change either.
The part of tx tstamp that needs more thought is how to notify the bpf prog to consume it. Potentially the kernel can invoke a bpf prog to collect the tx timestamp when the bpf bit in sk->sk_tsflags is set. An example of how TCP-CC would use it will help in thinking through the approach here.
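To illustrate the separate-bit idea, here is a small user-space model, not kernel code. Everything prefixed MODEL_ or model_ is hypothetical: no such bpf bit exists in the UAPI, bit 31 is picked arbitrarily, and the error-queue condition is simplified to "any user tx recording flag is set".

```c
#include <stdbool.h>
#include <stdint.h>
#include <linux/net_tstamp.h>

/* HYPOTHETICAL bit reserved for bpf; not part of any kernel UAPI. */
#define MODEL_TSFLAG_BPF   (1u << 31)
/* Simplification: only these user flags cause error-queue reports. */
#define MODEL_USER_TX_MASK (SOF_TIMESTAMPING_TX_HARDWARE | \
                            SOF_TIMESTAMPING_TX_SOFTWARE)

struct model_sock {
    uint32_t tsflags;  /* stands in for sk->sk_tsflags */
};

/* Userspace setsockopt(): replaces the user bits, keeps the bpf bit. */
void model_user_set(struct model_sock *sk, uint32_t val)
{
    sk->tsflags = (sk->tsflags & MODEL_TSFLAG_BPF) |
                  (val & ~MODEL_TSFLAG_BPF);
}

/* bpf request: touches only the bpf bit, never the user bits. */
void model_bpf_set(struct model_sock *sk, bool on)
{
    if (on)
        sk->tsflags |= MODEL_TSFLAG_BPF;
    else
        sk->tsflags &= ~MODEL_TSFLAG_BPF;
}

/* Would a tx timestamp be queued to sk_error_queue (and raise POLLERR)? */
bool model_report_to_errqueue(const struct model_sock *sk)
{
    return (sk->tsflags & MODEL_USER_TX_MASK) != 0;
}

/* Would a tx timestamp be handed to the bpf prog? */
bool model_report_to_bpf(const struct model_sock *sk)
{
    return (sk->tsflags & MODEL_TSFLAG_BPF) != 0;
}
```

In this model, setting the bpf bit alone never produces an error-queue report, and userspace clearing its flags does not take the timestamps away from the bpf prog.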