author     Philo Lu <lulie@linux.alibaba.com>      2024-11-14 18:52:06 +0800
committer  David S. Miller <davem@davemloft.net>   2024-11-18 11:56:21 +0000
commit     78c91ae2c6deb5d236a5a93ff2995cdd05514380
tree       f26282e1c034c2bf23c2891b63ff19d7951a34b2 /net/ipv6
parent     dab78a1745ab3c6001e1e4d50a9d09efef8e260d
ipv4/udp: Add 4-tuple hash for connected socket
Currently, the udp_table has two hash tables: the port hash and the portaddr
(local port + address) hash. For UDP servers, all sockets in a reuseport group
usually share the same local port and address, so they all land in the same
hash slot.
In some applications, UDP servers use connect() to manage clients. In
particular, when a datagram first arrives from an unseen 4-tuple, a new socket
is created and connect()ed to the remote addr:port, and that fd is then used
exclusively for that client.
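For illustration only (not part of this patch), a minimal userspace sketch of
that connect()-per-client pattern could look like the following; the helper
name is hypothetical, and error handling plus the local bind() are elided:

    #include <sys/socket.h>
    #include <netinet/in.h>

    /* Hypothetical helper: on the first datagram from an unseen 4-tuple,
     * create a dedicated socket and connect() it to the remote addr:port.
     */
    static int new_client_fd(int listen_fd)
    {
            struct sockaddr_in peer;
            socklen_t plen = sizeof(peer);
            char buf[2048];
            int cfd, one = 1;

            /* Peek at the datagram to learn the remote 4-tuple without
             * consuming it from the receive queue.
             */
            recvfrom(listen_fd, buf, sizeof(buf), MSG_PEEK,
                     (struct sockaddr *)&peer, &plen);

            cfd = socket(AF_INET, SOCK_DGRAM, 0);
            setsockopt(cfd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
            /* bind(cfd, <same local addr:port as listen_fd>, ...); */
            connect(cfd, (struct sockaddr *)&peer, plen);

            return cfd;     /* used exclusively for this client from now on */
    }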
Once there are connected sockets in a reuseport group, UDP has to score every
socket in the same hash2 slot to find the best match. With a large number of
connections this becomes inefficient and results in high softirq overhead.
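In pseudo-C terms (heavily simplified, not the actual kernel code; the
iterator and the argument list are placeholders), the hash2 path boils down to
scoring every entry in the slot:

    /* Pseudo-C sketch of the pre-existing hash2 lookup. */
    best = NULL;
    best_score = -1;
    for_each_sock_in_hslot2(sk, hslot2) {            /* placeholder iterator */
            /* compute_score() checks addresses, ports, bound device, ... */
            score = compute_score(sk /* , net, saddr, sport, daddr, ... */);
            if (score > best_score) {
                    best_score = score;
                    best = sk;
            }
    }
    /* With N connected sockets in the group, every incoming packet walks
     * all N entries here, which is the softirq overhead described above.
     */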
To solve this, the patch implements a 4-tuple hash for connected UDP sockets.
During connect(), the hash4 slot is updated, together with a corresponding
counter, hash4_cnt, in hslot2. In __udp4_lib_lookup(), hslot4 is searched
first if the counter is non-zero; otherwise hslot2 is used as before. Note
that only connected sockets take this hash4 path, while unconnected ones are
not affected.
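The resulting lookup order can be sketched as follows (illustrative pseudo-C;
the helper names are placeholders, not the patch's actual functions):

    /* Pseudo-C sketch of the lookup order after this patch. */
    if (hslot2->hash4_cnt) {
            /* Fast path: at least one connected socket of this hash2 slot
             * lives in the 4-tuple table, so try an exact 4-tuple match.
             */
            sk = lookup_hash4(udptable, hash4);      /* placeholder helper */
            if (sk)
                    return sk;
    }
    /* Fallback: the unchanged hash2 path, scoring every candidate. */
    return lookup_hash2(udptable, hash2);            /* placeholder helper */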
hlist_nulls is used for hash4 because a lookup running concurrently with a
rehash may wrongly wander onto another hslot's chain; checking the nulls value
at the end of the list tells us whether the lookup must be restarted. Because
UDP does not use SLAB_TYPESAFE_BY_RCU, sk_refcnt does not need to be touched
during lookup.
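The restart check follows the usual hlist_nulls pattern, roughly as below
(simplified sketch; the list head, node member name, and match test are
placeholders):

    /* Simplified sketch of the nulls-protected hash4 walk. */
    struct hlist_nulls_node *node;
    struct sock *sk;

    begin:
            hlist_nulls_for_each_entry_rcu(sk, node, &hslot4->nulls_head, member) {
                    if (/* sk matches the incoming 4-tuple exactly */)
                            return sk;
            }
            /* A concurrent rehash can move the walk onto another slot's
             * chain.  The nulls value stored at the list end encodes the
             * slot, so a mismatch means the walk strayed and must restart.
             */
            if (get_nulls_value(node) != slot_nulls_value)
                    goto begin;
            return NULL;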
Stress test results (with one CPU fully used) are shown below, in pps:
(1) _un-connected_ socket as server
[a] w/o hash4: 1,825,176
[b] w/ hash4: 1,831,750 (+0.36%)
(2) 500 _connected_ sockets as server
[c] w/o hash4: 290,860 (only 16% of [a])
[d] w/ hash4: 1,889,658 (+3.1% compared with [b])
With hash4, compute_score() is skipped during lookup, so [d] is slightly
better than [b].
Co-developed-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Co-developed-by: Fred Chen <fred.cc@alibaba-inc.com>
Signed-off-by: Fred Chen <fred.cc@alibaba-inc.com>
Co-developed-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com>
Signed-off-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com>
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/ipv6')
-rw-r--r--  net/ipv6/udp.c | 2
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0d7aac9d44e5..1ea99d704e31 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -111,7 +111,7 @@ void udp_v6_rehash(struct sock *sk)
 					  &sk->sk_v6_rcv_saddr,
 					  inet_sk(sk)->inet_num);
 
-	udp_lib_rehash(sk, new_hash);
+	udp_lib_rehash(sk, new_hash, 0); /* 4-tuple hash not implemented */
 }
 
 static int compute_score(struct sock *sk, const struct net *net,