An iptables rule like the following on a multicore system will result in accepting more connections than the limit set in the rule.
iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \
	--connlimit-above 2000 --connlimit-mask 0 -j DROP
In the check_hlist() function, connections that are found in the saved connections but not in the netfilter conntrack table are deleted, on the assumption that those connections no longer exist. But on multicore systems there is a small time window in which a connection has been added to the rb-tree maintained by xt_connlimit but has not yet made it into the netfilter conntrack table. This causes concurrent connections to return incorrect counts and exceed the limit set in the iptables rule.
Connection 1 on Core 1                  Connection 2 on Core 2

list_length = N
conntrack_table_len = N
spin_lock_bh()
In check_hlist() function
  a. loop over saved connections
     1. call nf_conntrack_find_get()
     2. if not found in 1,
        i. call hlist_del()
  b. return total count to caller
  c. connection gets added to the
     list of saved connections
spin_unlock_bh()
list_length = N + 1
                                        spin_lock_bh()
                                        In check_hlist() function
                                          a. loop over saved connections
                                             1. call nf_conntrack_find_get()
                                             2. if not found in 1,
                                                i. call hlist_del()
                                                   [connection 1 was in the
                                                    list but not in
                                                    nf_conntrack yet]
                                                ii. connection 1 gets deleted
                                        list_length = N
                                        conntrack_table_len = N
                                          b. return total count to caller
                                          c. connection 2 gets added to the
                                             list of saved connections
                                        spin_unlock_bh()
d. connection 1 gets added to
   nf_conntrack
list_length = N + 1
conntrack_table_len = N + 1
                                        e. connection 2 gets added to
                                           nf_conntrack
                                        list_length = N + 1
                                        conntrack_table_len = N + 2
So we end up with N + 1 connections in the list but N + 2 in nf_conntrack, eventually allowing more connections than the limit set in the rule.
This fix adds an additional field to track such pending connections, preventing them from being deleted by another execution thread on a different core, and returns the correct count.
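The effect of the pending_add flag can be illustrated with a small user-space sketch (a hypothetical simplification: a plain singly linked list and a boolean stand in for the kernel's hlist and for nf_conntrack_find_get(); the names check_list and in_conntrack are illustrative, not from the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for struct xt_connlimit_conn. */
struct conn {
	int id;
	bool pending_add;   /* added to saved list, conntrack insert not done */
	bool in_conntrack;  /* stand-in for nf_conntrack_find_get() != NULL */
	struct conn *next;
};

/* Mirrors the fixed check_hlist() loop: an entry missing from conntrack
 * is evicted only if it is NOT pending insertion; pending entries are
 * counted and kept, so another core cannot delete them prematurely. */
static unsigned int check_list(struct conn **head)
{
	unsigned int length = 0;
	struct conn **pp = head;

	while (*pp) {
		struct conn *c = *pp;

		if (!c->in_conntrack) {
			if (c->pending_add) {
				length++;      /* keep: insert still pending */
				pp = &c->next;
			} else {
				*pp = c->next; /* evict: genuinely gone */
			}
			continue;
		}
		c->pending_add = false; /* confirmed: clear the flag */
		length++;
		pp = &c->next;
	}
	return length;
}
```

Without the flag, a pending entry would be evicted and the returned count would come up short, which is exactly how the limit ends up being exceeded.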
Signed-off-by: Alakesh Haloi <alakeshh@amazon.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: stable@vger.kernel.org # v4.15 and before
---
 net/netfilter/xt_connlimit.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index ffa8eec980e9..bd7563c209a4 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -47,6 +47,7 @@ struct xt_connlimit_conn {
 	struct hlist_node node;
 	struct nf_conntrack_tuple tuple;
 	union nf_inet_addr addr;
+	bool pending_add;
 };
 
 struct xt_connlimit_rb {
@@ -126,6 +127,7 @@ static bool add_hlist(struct hlist_head *head,
 		return false;
 	conn->tuple = *tuple;
 	conn->addr = *addr;
+	conn->pending_add = true;
 	hlist_add_head(&conn->node, head);
 	return true;
 }
@@ -144,15 +146,31 @@ static unsigned int check_hlist(struct net *net,
 
 	*addit = true;
 
-	/* check the saved connections */
+	/* check the saved connections
+	 */
 	hlist_for_each_entry_safe(conn, n, head, node) {
 		found = nf_conntrack_find_get(net, zone, &conn->tuple);
 		if (found == NULL) {
-			hlist_del(&conn->node);
-			kmem_cache_free(connlimit_conn_cachep, conn);
+			/* It could be an already deleted connection or
+			 * a new connection that is not there in conntrack
+			 * yet. If former delete it from the list, else
+			 * increase count and move on.
+			 */
+			if (conn->pending_add) {
+				length++;
+			} else {
+				hlist_del(&conn->node);
+				kmem_cache_free(connlimit_conn_cachep, conn);
+			}
 			continue;
 		}
 
+		/* If it is a connection that was pending insertion to
+		 * connection tracking table before, then it's time to clear
+		 * the flag.
+		 */
+		conn->pending_add = false;
+
 		found_ct = nf_ct_tuplehash_to_ctrack(found);
 
 		if (nf_ct_tuple_equal(&conn->tuple, tuple)) {
On Mon, Nov 19, 2018 at 10:17:38PM +0000, Alakesh Haloi wrote:
An iptables rule like the following on a multicore system will result in accepting more connections than the limit set in the rule.
iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \
	--connlimit-above 2000 --connlimit-mask 0 -j DROP
In the check_hlist() function, connections that are found in the saved connections but not in the netfilter conntrack table are deleted, on the assumption that those connections no longer exist. But on multicore systems there is a small time window in which a connection has been added to the rb-tree maintained by xt_connlimit but has not yet made it into the netfilter conntrack table. This causes concurrent connections to return incorrect counts and exceed the limit set in the iptables rule.
Connection 1 on Core 1                  Connection 2 on Core 2

list_length = N
conntrack_table_len = N
spin_lock_bh()
In check_hlist() function
  a. loop over saved connections
     1. call nf_conntrack_find_get()
     2. if not found in 1,
        i. call hlist_del()
  b. return total count to caller
  c. connection gets added to the
     list of saved connections
spin_unlock_bh()
list_length = N + 1
                                        spin_lock_bh()
                                        In check_hlist() function
                                          a. loop over saved connections
                                             1. call nf_conntrack_find_get()
                                             2. if not found in 1,
                                                i. call hlist_del()
                                                   [connection 1 was in the
                                                    list but not in
                                                    nf_conntrack yet]
                                                ii. connection 1 gets deleted
                                        list_length = N
                                        conntrack_table_len = N
                                          b. return total count to caller
                                          c. connection 2 gets added to the
                                             list of saved connections
                                        spin_unlock_bh()
d. connection 1 gets added to
   nf_conntrack
list_length = N + 1
conntrack_table_len = N + 1
                                        e. connection 2 gets added to
                                           nf_conntrack
                                        list_length = N + 1
                                        conntrack_table_len = N + 2
So we end up with N + 1 connections in the list but N + 2 in nf_conntrack, eventually allowing more connections than the limit set in the rule.
This fix adds an additional field to track such pending connections, preventing them from being deleted by another execution thread on a different core, and returns the correct count.
Signed-off-by: Alakesh Haloi <alakeshh@amazon.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: stable@vger.kernel.org # v4.15 and before
---
 net/netfilter/xt_connlimit.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)
What is the git commit id of this patch in Linus's tree?
thanks,
greg k-h
On Tue, Nov 20, 2018 at 08:48:39AM +0100, Greg KH wrote:
On Mon, Nov 19, 2018 at 10:17:38PM +0000, Alakesh Haloi wrote:
An iptable rule like the following on a multicore systems will result in
[...]
This fix adds an additional field to track such pending connections, preventing them from being deleted by another execution thread on a different core, and returns the correct count.
[...]
What is the git commit id of this patch in Linus's tree?
There is no upstream commit yet.
@Alakesh: You have to submit your patch to netfilter-devel@vger.kernel.org first for review; the patch may then be integrated upstream via the nf.git tree. From there it will be passed to upstream maintainer David S. Miller via pull request, so it propagates to net.git, and David will then pass it up to Linus, again via pull request.
I am telling you all this because only once the patch shows up in Linus's git tree can we request inclusion in -stable, not sooner.
Thanks.
On Tue, Nov 20, 2018 at 10:44:36AM +0100, Pablo Neira Ayuso wrote:
On Tue, Nov 20, 2018 at 08:48:39AM +0100, Greg KH wrote:
On Mon, Nov 19, 2018 at 10:17:38PM +0000, Alakesh Haloi wrote:
An iptable rule like the following on a multicore systems will result in
[...]
This fix adds an additional field to track such pending connections, preventing them from being deleted by another execution thread on a different core, and returns the correct count.
[...]
What is the git commit id of this patch in Linus's tree?
There is no upstream commit yet.
@Alakesh: You have to submit your patch to netfilter-devel@vger.kernel.org first for review; the patch may then be integrated upstream via the nf.git tree. From there it will be passed to upstream maintainer David S. Miller via pull request, so it propagates to net.git, and David will then pass it up to Linus, again via pull request.
I am telling you all this because only once the patch shows up in Linus's git tree can we request inclusion in -stable, not sooner.
Thanks.
Thanks Greg and Pablo for your suggestions! We found this issue on the 4.14 stable kernel, hence the fix is based on 4.14. The xt_connlimit module source seems to have been refactored since. At one point I tested 4.18-rc1 and the issue was still present; however, I have not tested the most recent kernel. I will follow your suggestions and try to reproduce the issue in the master branch of the nf.git tree and in Linus's tree, and if I cannot reproduce it there, I will go ahead and pick the relevant patches for backporting. This patch fixes the issue without bringing in any refactoring patches, but that is probably not the right way to go.
Thanks --Alakesh
Alakesh Haloi alakeshh@amazon.com wrote:
Thanks Greg and Pablo for your suggestions! We found this issue on the 4.14 stable kernel, hence the fix is based on 4.14. The xt_connlimit module source seems to have been refactored since. At one point I tested 4.18-rc1 and the issue was still present; however, I have not tested the most recent kernel. I will follow your suggestions and try to reproduce the issue in the master branch of the nf.git tree and in Linus's tree, and if I cannot reproduce it there, I will go ahead and pick the relevant patches for backporting. This patch fixes the issue without bringing in any refactoring patches, but that is probably not the right way to go.
Actually it might be needed; the changes upstream are pretty invasive.
So, in case you can reproduce this with nf.git or Linus's tree, it would be great if you could send a fix for nf.git.
But in case you can't reproduce it, it's possible your patch is still needed for stable.
On Wed, Nov 21, 2018 at 01:51:31AM +0100, Florian Westphal wrote:
Alakesh Haloi alakeshh@amazon.com wrote:
Thanks Greg and Pablo for your suggestions! We found this issue on the 4.14 stable kernel, hence the fix is based on 4.14. The xt_connlimit module source seems to have been refactored since. At one point I tested 4.18-rc1 and the issue was still present; however, I have not tested the most recent kernel. I will follow your suggestions and try to reproduce the issue in the master branch of the nf.git tree and in Linus's tree, and if I cannot reproduce it there, I will go ahead and pick the relevant patches for backporting. This patch fixes the issue without bringing in any refactoring patches, but that is probably not the right way to go.
Actually it might be needed; the changes upstream are pretty invasive.
So, in case you can reproduce this with nf.git or Linus's tree, it would be great if you could send a fix for nf.git.
But in case you can't reproduce it, it's possible your patch is still needed for stable.
Thanks Florian! I have tested Linus's tree and I do not see the issue happening there. I have not been able to test nf.git yet. Do you suggest that I start working on backporting the relevant patches from mainline, or should it be possible to apply this patch to the stable branches directly?
Alakesh Haloi alakeshh@amazon.com wrote:
But in case you can't reproduce it, it's possible your patch is still needed for stable.
Thanks Florian! I have tested Linus's tree and I do not see the issue happening there. I have not been able to test nf.git yet. Do you suggest that I start working on backporting the relevant patches from mainline, or should it be possible to apply this patch to the stable branches directly?
The relevant mainline fix is probably b36e4523d4d56e2595e28f16f6ccf1cd6a9fc452 ("netfilter: nf_conncount: fix garbage collection confirm race").
But:
1. I don't like this fix (I could not come up with anything better...)
2. It will not apply to older stable branches.
So I think you might want to look at this commit, see if you have a better idea, and if not, apply a similar strategy to the older stable kernels, then pass it as a backport to the stable maintainers. I can review the patch.
On Tue, Nov 27, 2018 at 11:38:47PM +0100, Florian Westphal wrote:
Alakesh Haloi alakeshh@amazon.com wrote:
But in case you can't reproduce it, it's possible your patch is still needed for stable.
Thanks Florian! I have tested Linus's tree and I do not see the issue happening there. I have not been able to test nf.git yet. Do you suggest that I start working on backporting the relevant patches from mainline, or should it be possible to apply this patch to the stable branches directly?
The relevant mainline fix is probably b36e4523d4d56e2595e28f16f6ccf1cd6a9fc452 ("netfilter: nf_conncount: fix garbage collection confirm race").
But
- I don't like this fix (I could not come up with anything better...)
- It will not apply to older stable branches.
So I think you might want to look at this commit, see if you have a better idea, and if not, apply a similar strategy to the older stable kernels, then pass it as a backport to the stable maintainers. I can review the patch.
I tried porting the fix to the 4.14 kernel, mostly bringing in the concept of saving the CPU number and looking at the age of the connection before deleting it. It seems to improve the situation but does not fix the problem entirely. The number of connections that go beyond the set limit seems to depend on the number of threads on the sender side that send connection requests. If I can improve it or fix it entirely, I will send the patch out.
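The CPU-plus-age idea mentioned above can be sketched as follows (illustrative user-space C; the struct fields and the 2-tick threshold are assumptions for illustration, not the exact upstream code): an unconfirmed entry may be evicted by the CPU that added it, or by any CPU once the entry is old enough that a pending conntrack insert must have completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch of the eviction heuristic from the mainline
 * commit. Field and constant names are assumptions, not upstream code. */
struct entry {
	int cpu;             /* CPU that added the entry */
	uint32_t jiffies32;  /* tick count at the time of addition */
};

#define AGE_THRESHOLD 2u /* ticks after which a pending add has surely landed */

static bool may_evict(const struct entry *e, int this_cpu, uint32_t now)
{
	/* Unsigned subtraction handles jiffies wraparound. */
	uint32_t age = now - e->jiffies32;

	/* Same CPU: it cannot race with its own pending insert.
	 * Other CPUs must wait until the entry is old enough. */
	return e->cpu == this_cpu || age >= AGE_THRESHOLD;
}
```

The design choice here is that only cross-CPU eviction of fresh entries is dangerous; restricting just that case keeps the common path cheap.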
The second issue I wanted to bring up: I tried the latest Linus's tree, ran my experiments to create connections, bumped up the number of threads creating connections, and I see a kernel panic with list delete corruption. The panic I am seeing is below. So it looks like the refactoring around xt_connlimit may not be stable and needs more work.
[  259.988383] ------------[ cut here ]------------
[  259.989790] list_del corruption, ffff88cd473fbe18->prev is LIST_POISON2 (dead000000000200)
[  259.991999] WARNING: CPU: 3 PID: 0 at lib/list_debug.c:50 __list_del_entry_valid+0x92/0xa0
[  259.994160] Modules linked in: xt_connlimit nf_conncount nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nfit libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd ppdev cryptd glue_helper i2c_piix4 parport_pc i2c_core parport pcspkr ip_tables xfs libcrc32c nvme serio_raw crc32c_intel ena nvme_core
[  260.001621] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.20.0-rc4+ #2
[  260.003343] Hardware name: Amazon EC2 m5d.xlarge/, BIOS 1.0 10/16/2017
[  260.005116] RIP: 0010:__list_del_entry_valid+0x92/0xa0
[  260.006541] Code: 31 c0 c3 48 89 fe 31 c0 48 c7 c7 68 83 8b a4 e8 f4 68 cc ff 0f 0b 31 c0 c3 48 89 fe 31 c0 48 c7 c7 30 83 8b a4 e8 de 68 cc ff <0f> 0b 31 c0 c3 90 90 90 90 90 90 90 90 90 41 55 48 85 d2 49 89 d5
[  260.011337] RSP: 0018:ffff88cd52b83940 EFLAGS: 00010286
[  260.012782] RAX: 0000000000000000 RBX: ffff88cd4f3c9018 RCX: 000000000000083f
[  260.014692] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
[  260.019022] RBP: ffff88cd473fbe18 R08: 0000000000000000 R09: 0000000000000225
[  260.023354] R10: 0000000000000000 R11: ffff88cd52b836b0 R12: ffff88cd52b839b7
[  260.027740] R13: ffff88cd4f3c9018 R14: ffff88cd473fbe18 R15: ffffffffc00da501
[  260.032115] FS:  0000000000000000(0000) GS:ffff88cd52b80000(0000) knlGS:0000000000000000
[  260.039137] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  260.043161] CR2: 00007efc39289000 CR3: 000000040dbc4002 CR4: 00000000007606e0
[  260.047503] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  260.051891] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  260.056215] PKRU: 55555554
[  260.059485] Call Trace:
[  260.062712]  <IRQ>
[  260.065815]  conn_free+0x29/0x90 [nf_conncount]
[  260.069558]  find_or_evict+0x5c/0x70 [nf_conncount]
[  260.073357]  nf_conncount_lookup+0xa7/0x350 [nf_conncount]
[  260.077346]  nf_conncount_count+0x28e/0x4ae [nf_conncount]
[  260.081295]  ? __fib_validate_source+0x11a/0x410
[  260.085187]  connlimit_mt+0x95/0x173 [xt_connlimit]
[  260.089014]  ? tcp_in_window+0xf8/0x890 [nf_conntrack]
[  260.092936]  ipt_do_table+0x264/0x650 [ip_tables]
[  260.096710]  nf_hook_slow+0x3d/0xb0
[  260.100178]  ? ip_route_input_noref+0x24/0x40
[  260.103834]  ip_local_deliver+0xcc/0xe0
[  260.107465]  ? ip_sublist_rcv_finish+0x70/0x70
[  260.111229]  ip_rcv+0x52/0xd0
[  260.114626]  ? ip_rcv_finish_core.isra.13+0x370/0x370
[  260.118509]  __netif_receive_skb_one_core+0x52/0x70
[  260.122326]  netif_receive_skb_internal+0x42/0xf0
[  260.126074]  napi_gro_receive+0xbf/0xe0
[  260.129619]  ena_clean_rx_irq+0x2c4/0x7e0 [ena]
[  260.133372]  ? kmsg_dump+0xa1/0xe0
[  260.136817]  ena_io_poll+0x430/0x8b0 [ena]
[  260.140445]  net_rx_action+0x297/0x3c0
[  260.143958]  __do_softirq+0xd6/0x2a9
[  260.147436]  irq_exit+0xdb/0xf0
[  260.150860]  do_IRQ+0x54/0xe0
[  260.154188]  common_interrupt+0xf/0xf
[  260.157718]  </IRQ>
[  260.160839] RIP: 0010:native_safe_halt+0x2/0x10
[  260.164572] Code: e1 5b ff ff ff 7f c3 f3 c3 65 48 8b 04 25 80 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8a eb c0 90 90 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
[  260.176697] RSP: 0018:ffffb77b0190beb0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd8
[  260.183599] RAX: ffffffffa41f5b60 RBX: ffff88ca4625bc00 RCX: 0000000000000001
[  260.188043] RDX: 0000000000000001 RSI: 0000000000000083 RDI: 0000000000000003
[  260.192457] RBP: 0000000000000003 R08: 00000000cccccccc R09: ffffffffa4624f65
[  260.196868] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  260.201250] R13: 0000000000000000 R14: ffff88ca4625bc00 R15: ffff88ca4625bc00
[  260.205625]  ? mwait_idle+0x1e0/0x1e0
[  260.209149]  default_idle+0x1a/0x140
[  260.212740]  do_idle+0x1a6/0x290
[  260.216174]  cpu_startup_entry+0x19/0x20
[  260.219763]  start_secondary+0x1aa/0x200
[  260.223483]  secondary_startup_64+0xa4/0xb0
[  260.227257] ---[ end trace fc24593acc754b3b ]---
Alakesh Haloi <alakeshh@amazon.com> wrote:
The second issue I wanted to bring up: I tried the latest Linus's tree, ran my experiments to create connections, bumped up the number of threads creating connections, and I see a kernel panic with list delete corruption. The panic I am seeing is below. So it looks like the refactoring around xt_connlimit may not be stable and needs more work.
Can you pull the latest version?
There were a couple of fixes for BH locking from Taehee Yoo for nf_conncount that were applied to Linus's tree a few hours ago.
On Thu, Nov 29, 2018 at 01:28:58AM +0100, Florian Westphal wrote:
Alakesh Haloi <alakeshh@amazon.com> wrote:
The second issue I wanted to bring up: I tried the latest Linus's tree, ran my experiments to create connections, bumped up the number of threads creating connections, and I see a kernel panic with list delete corruption. The panic I am seeing is below. So it looks like the refactoring around xt_connlimit may not be stable and needs more work.
Can you pull the latest version?
There were a couple of fixes for BH locking from Taehee Yoo for nf_conncount that were applied to Linus's tree a few hours ago.
Thanks! The latest version does not have the list delete corruption, but the connection count issue is back. For example, with the following iptables rule in place, we can easily create 2150+ connections from 20 threads, with the server program running on a 4-core virtual machine.
iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit --connlimit-above 2000 --connlimit-mask 0 -j DROP
On Thu, Nov 29, 2018 at 01:28:58AM +0100, Florian Westphal wrote:
Alakesh Haloi <alakeshh@amazon.com> wrote:
The second issue I wanted to bring up: I tried the latest Linus's tree, ran my experiments to create connections, bumped up the number of threads creating connections, and I see a kernel panic with list delete corruption. The panic I am seeing is below. So it looks like the refactoring around xt_connlimit may not be stable and needs more work.
Can you pull the latest version?
There were a couple of fixes for BH locking from Taehee Yoo for nf_conncount that were applied to Linus's tree a few hours ago.
In my previous response I indicated that the issue still exists in mainline, which is not correct; there were some testing errors on my side. The issue is not seen on mainline. Since the mainline code has a lot of refactoring and other fixes, I have come up with a simpler patch, based on the upstream commit, that fixes this issue. I am going to send v2 of the original patch in another email.