Hmm, after taking a new look at it today, I think my patch can be disregarded---at least for having a BPF program access *RX* *hardware* timestamps. (Sorry about the noise then.)
When I looked into this a few months ago, I half-blindly followed Documentation/networking/timestamping.rst and afterwards assumed bpf_setsockopt(SO_TIMESTAMPING*) would be necessary for my use case (described at the end).
Looking at it again today, it seems the ioctl(SIOCSHWTSTAMP) is sufficient here: it enables hardware timestamping on the device, and the resulting timestamps are placed in the hwtstamps field of the skb's skb_shared_info. This hwtstamps field is where the values of __sk_buff.hwtstamp and bpf_sock_ops.skb_hwtstamp come from. No further timestamp processing is involved when a BPF program reads these two fields, meaning bpf_setsockopt(SOF_TIMESTAMPING_RX_HARDWARE) would be a no-op from the view of a BPF program.
I started this message before coming to the above understanding but I've left my replies in below.
With bpf_setsockopt(SOF_TIMESTAMPING_RX_HARDWARE) being unnecessary, and bpf_setsockopt(SOF_TIMESTAMPING_RX_SOFTWARE), as I understand, having a number of possibly unwanted implications---should we leave it at that here?
On Wed, 2024-01-17 at 13:23 -0800, Martin KaFai Lau wrote:
On 1/17/24 7:55 AM, Willem de Bruijn wrote:
Martin KaFai Lau wrote:
On 1/16/24 7:17 AM, Willem de Bruijn wrote:
Jörn-Thorben Hinz wrote:
A BPF application, e.g., a TCP congestion control, might benefit from or even require precise (=hardware) packet timestamps. These timestamps are already available through __sk_buff.hwtstamp and bpf_sock_ops.skb_hwtstamp, but could not be requested: BPF programs were not allowed to set SO_TIMESTAMPING* on sockets.
This patch only uses the SOF_TIMESTAMPING_RX_HARDWARE in the selftest. How about others? e.g. the SOF_TIMESTAMPING_TX_* that will affect the sk->sk_error_queue which seems not good. If rx tstamp is useful, tx tstamp should be useful also?
I admit I only ever looked at enabling and using SOF_TIMESTAMPING_RX_HARDWARE for my/our use case. With that, I was not aware that _SOFTWARE has more, possibly complicating implications.
Good point. Or they should not be allowed to be set from BPF.
That significantly changes process behavior, e.g., by returning POLLERR.
Enable BPF programs to actively request the generation of timestamps from a stream socket. The also required ioctl(SIOCSHWTSTAMP) on the network device must still be done separately, in user space.
hmm... so both ioctl(SIOCSHWTSTAMP) of the netdevice and the SOF_TIMESTAMPING_RX_HARDWARE of the sk must be done?
I am likely missing something. When the skb is created in the driver rx path, the sk is not known yet though. How does the SOF_TIMESTAMPING_RX_HARDWARE of the sk affect skb_shinfo(skb)->hwtstamps?
I mostly followed Documentation/networking/timestamping.rst (section 3) to understand how the hardware timestamps are to be set up and used.
From my understanding, the ioctl(SIOCSHWTSTAMP) makes a persistent setting for the device/driver, independent of the lifetime of any socket or skb.
I used a simplified program[1] when trying out this patch a few months ago.
Indeed it does not seem to do anything in the datapath.
Requesting SOF_TIMESTAMPING_RX_SOFTWARE will call net_enable_timestamp to start timestamping packets.
But SOF_TIMESTAMPING_RX_HARDWARE does no such thing.
Drivers do use it in ethtool get_ts_info to signal hardware capabilities. But those must be configured using the ioctl.
It is there more for consistency with the other timestamp recording options, I suppose.
Thanks for the explanation on the SOF_TIMESTAMPING_RX_{HARDWARE,SOFTWARE}.
__sk_buff.hwtstamp should have the NIC rx timestamp then as long as the NIC is ioctl configured.
Jorn, do you need RX_SOFTWARE? From looking at net_timestamp_set(), any one socket having requested RX_SOFTWARE should be enough to get a skb->tstamp for all skbs. A workaround is to manually create a socket and turn on RX_SOFTWARE.
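The workaround mentioned above could look roughly like the following user-space sketch. The function name open_rx_software_tstamp_socket is made up here, and the choice of a UDP socket is arbitrary. It assumes the software timestamping stays enabled only for as long as the returned descriptor is kept open.

```c
#include <errno.h>
#include <linux/net_tstamp.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a socket whose only purpose is to request RX software timestamps.
 * Returns the descriptor, to be kept open by the caller, or -1 with errno
 * set on failure.
 */
int open_rx_software_tstamp_socket(void)
{
	int flags = SOF_TIMESTAMPING_RX_SOFTWARE;
	int fd, saved_errno;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags,
		       sizeof(flags)) < 0) {
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	return fd;
}
```

The socket never has to send or receive anything; it only holds the timestamping request.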
No, my use case was only for the RX hardware timestamps, as close to the packet reception time point as possible.
It will still be nice to get proper bpf_setsockopt() support for RX_SOFTWARE but it should be considered together with how SOF_TIMESTAMPING_TX_* should work in a bpf prog, considering the TX tstamping does not have a workaround solution like RX_SOFTWARE.
It is probably cleaner to have a separate bit in sk->sk_tsflags for bpf such that the bpf prog won't be affected by the userspace turning it on/off and it won't change the userspace's expectation also (e.g. sk_error_queue and POLLERR).
The part that needs more thought for the tx tstamp is how to notify the bpf prog to consume it. Potentially the kernel can invoke a bpf prog to collect the tx timestamp when the bpf bit in sk->sk_tsflags is set. An example of how TCP-CC is using it will help to think of the approach here.
My (academic) application was an implementation[2,3] of PowerTCP[4], a CC that (in its simplified variant) profits from precise timestamping. Only the RX timestamps would be of use there.
As mentioned above, I used[1] a while ago when I looked into timestamp usage. It shows how I imagine the timestamps could be accessed and used (similarly implemented in [2]).
[1] https://github.com/jtdor/bpf_hwtstamps
[2] https://github.com/inet-tub/powertcp-linux
[3] https://schmiste.github.io/ebpf23.pdf
[4] https://schmiste.github.io/nsdi22powertcp.pdf