tcp: sk_add_backlog() is too agressive for TCP
authorEric Dumazet <edumazet@google.com>
Sun, 22 Apr 2012 23:38:54 +0000 (23:38 +0000)
committerDavid S. Miller <davem@davemloft.net>
Tue, 24 Apr 2012 02:28:28 +0000 (22:28 -0400)
While investigating TCP performance problems on 10Gb+ links, we found a
tcp sender was dropping lot of incoming ACKS because of sk_rcvbuf limit
in sk_add_backlog(), especially if receiver doesnt use GRO/LRO and sends
one ACK every two MSS segments.

A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default
value (87380), allowing a too small backlog.

A TCP ACK, even being small, can consume nearly same truesize space than
outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and
is fast to compute.

Performance results on netperf, single flow, receiver with disabled
GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop
increments at sender.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/tcp_ipv4.c
net/ipv6/tcp_ipv6.c

index 917607e9bd5b79aa3d22c7d44887b438e879ccef..cf97e9821d76b352f0f786f1201967db8dd0d6ea 100644 (file)
@@ -1752,7 +1752,8 @@ process:
                        if (!tcp_prequeue(sk, skb))
                                ret = tcp_v4_do_rcv(sk, skb);
                }
-       } else if (unlikely(sk_add_backlog(sk, skb, sk->sk_rcvbuf))) {
+       } else if (unlikely(sk_add_backlog(sk, skb,
+                                          sk->sk_rcvbuf + sk->sk_sndbuf))) {
                bh_unlock_sock(sk);
                NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
                goto discard_and_relse;
index b04e6d8a8371631c350e96162494cc50fe9a3c47..5fb19d345cfd00608507b7207e2d8bb34d291afa 100644 (file)
@@ -1654,7 +1654,8 @@ process:
                        if (!tcp_prequeue(sk, skb))
                                ret = tcp_v6_do_rcv(sk, skb);
                }
-       } else if (unlikely(sk_add_backlog(sk, skb, sk->sk_rcvbuf))) {
+       } else if (unlikely(sk_add_backlog(sk, skb,
+                                          sk->sk_rcvbuf + sk->sk_sndbuf))) {
                bh_unlock_sock(sk);
                NET_INC_STATS_BH(net, LINUX_MIB_TCPBACKLOGDROP);
                goto discard_and_relse;