On Tue, 2023-05-16 at 22:15 +0200, Nicolas Dichtel wrote:
With a raw socket bound to IPPROTO_RAW (ie with hdrincl enabled), the protocol field of the flow structure, build by raw_sendmsg() / rawv6_sendmsg()), is set to IPPROTO_RAW. This breaks the ipsec policy lookup when some policies are defined with a protocol in the selector.
For ipv6, the sin6_port field from 'struct sockaddr_in6' could be used to specify the protocol. Just accept all values for IPPROTO_RAW socket.
For ipv4, the sin_port field of 'struct sockaddr_in' could not be used without breaking backward compatibility (the value of this field was never checked). Let's add a new kind of control message, so that the userland could specify which protocol is used.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") CC: stable@vger.kernel.org Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com
The first version has been marked 'Awaiting Upstream'. Steffen confirmed that the 'net' tree should be the target, thus I resend this patch. I also CC stable@vger.kernel.org.
include/net/ip.h | 2 ++ include/uapi/linux/in.h | 1 + net/ipv4/ip_sockglue.c | 15 ++++++++++++++- net/ipv4/raw.c | 5 ++++- net/ipv6/raw.c | 3 ++- 5 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/include/net/ip.h b/include/net/ip.h index c3fffaa92d6e..acec504c469a 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -76,6 +76,7 @@ struct ipcm_cookie { __be32 addr; int oif; struct ip_options_rcu *opt;
- __u8 protocol; __u8 ttl; __s16 tos; char priority;
@@ -96,6 +97,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm, ipcm->sockc.tsflags = inet->sk.sk_tsflags; ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if); ipcm->addr = inet->inet_saddr;
- ipcm->protocol = inet->inet_num;
} #define IPCB(skb) ((struct inet_skb_parm*)((skb)->cb)) diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h index 4b7f2df66b99..e682ab628dfa 100644 --- a/include/uapi/linux/in.h +++ b/include/uapi/linux/in.h @@ -163,6 +163,7 @@ struct in_addr { #define IP_MULTICAST_ALL 49 #define IP_UNICAST_IF 50 #define IP_LOCAL_PORT_RANGE 51 +#define IP_PROTOCOL 52 #define MCAST_EXCLUDE 0 #define MCAST_INCLUDE 1 diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index b511ff0adc0a..ec0fbe874426 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -317,7 +317,17 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc, ipc->tos = val; ipc->priority = rt_tos2priority(ipc->tos); break;
case IP_PROTOCOL:
if (cmsg->cmsg_len == CMSG_LEN(sizeof(int)))
val = *(int *)CMSG_DATA(cmsg);
else if (cmsg->cmsg_len == CMSG_LEN(sizeof(u8)))
val = *(u8 *)CMSG_DATA(cmsg);
AFAICS the 'dual' u8 support for IP_TOS has been introduce to cope with asymmetry WRT recvmsg(). Here we don't have (yet) the recvmsg counter- part, and if/when that will be added we can use the correct data type.
I think we are better off supporting only int, as e.g. IP_TTL does.
Side note, the above code could be factored out in an helper to be used both for IP_PROTOCOL and IP_TTL (possibly in a net-next patch).
Thanks!
Paolo