November 20, 2025 at 20:58, "Jakub Sitnicki" <jakub@cloudflare.com mailto:jakub@cloudflare.com?to=%22Jakub%20Sitnicki%22%20%3Cjakub%40cloudflare.com%3E > wrote:
On Thu, Nov 20, 2025 at 02:49 AM GMT, Jiayuan Chen wrote:
November 20, 2025 at 03:53, "Jakub Sitnicki" <jakub@cloudflare.com mailto:jakub@cloudflare.com?to=%22Jakub%20Sitnicki%22%20%3Cjakub%40cloudflare.com%3E > wrote:
[...]
+/* The BPF program sets BPF_F_INGRESS on sk_msg to indicate data needs to be
- redirected to the ingress queue of a specified socket. Since BPF_F_INGRESS is
- defined in UAPI so that we can't extend this enum for our internal flags. We
- define some internal flags here while inheriting BPF_F_INGRESS.
- */
+enum {
- SK_MSG_F_INGRESS = BPF_F_INGRESS, /* (1ULL << 0) */
- /* internal flag */
- SK_MSG_F_INGRESS_SELF = (1ULL << 1)
+};
I'm wondering if we need additional state to track this. Can we track sk_msg's construted from skb's that were not redirected by setting `sk_msg.sk = sk` to indicate that the source socket is us in sk_psock_skb_ingress_self()?
Functionally, that would work. However, in that case, we would have to hold a reference to sk until the sk_msg is read, which would delay the release of sk. One concern is that if there is a bug in the read-side application, sk might never be released.
We don't need to grab a reference to sk if we're talking about setting it only in sk_psock_skb_ingress_self(). psock already holds a ref for psock->sk, and we purge psock->ingress_msg queue when destroying the psock before releasing the sock ref in sk_psock_destroy().
I see. When it's an ingress to self redirection, the msg.sk would point to the same socket as psock->sk (the socket itself), not to another socket, so indeed no additional reference grab is needed.
While there's nothing wrong with an internal flaag, I'm trying to see if we make things somewhat consistent so as a result sk_msg state is easier to reason about.
My thinking here is that we already set sk_msg.sk to source socket in sk_psock_msg_verdict() on sendmsg() path, so we know that this is the purpose of that field. We could mimic this on recvmsg() path.