tcp: autocork should not hold first packet in write queue
author Eric Dumazet <edumazet@google.com>
Tue, 17 Dec 2013 17:58:30 +0000 (09:58 -0800)
committer David S. Miller <davem@davemloft.net>
Fri, 20 Dec 2013 22:56:25 +0000 (17:56 -0500)
commit a181ceb501b31b4bf8812a5c84c716cc31d82c2d
tree 8233fdcc125262b4985b26ff46d1c58bf2592d1a
parent a792866ad2dafb8f272e4fdfb98a93fdbfff2277

Willem noticed a TCP_RR regression caused by TCP autocorking
on a Mellanox test bed. MLX4_EN_TX_COAL_TIME is 16 us, which can be
slightly above the RTT between hosts.

We can therefore receive an ACK for a packet that is still in the NIC TX
ring buffer or in a softnet completion queue, that is, before its TX
completion has run.

Fix this by always pushing the skb if it is at the head of the write queue.

Also, as TX completion is lockless, it's safer to perform the sk_wmem_alloc
test after setting TSQ_THROTTLED.
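
The two changes above can be sketched as a small user-space model. This is
not the kernel code: struct model_skb, struct model_sock,
model_should_autocork() and model_push_now() are hypothetical names chosen
for illustration, and plain integers stand in for the kernel's atomic
counters and flag bits. It only shows the order of the checks: never cork
the skb at the head of the write queue, and re-test sk_wmem_alloc after the
throttled flag has been set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for kernel structures; every name here is
 * illustrative and not the kernel's own definition. */
struct model_skb {
	unsigned int len;       /* payload bytes queued in this skb */
	unsigned int truesize;  /* memory charged to the socket for this skb */
};

struct model_sock {
	const struct model_skb *write_queue_head; /* first skb in write queue */
	unsigned int wmem_alloc; /* models sk_wmem_alloc */
	bool tsq_throttled;      /* models the TSQ_THROTTLED flag */
	bool autocorking;        /* models the autocorking sysctl */
};

/* Cork only a small skb that is NOT at the head of the write queue, and
 * only while more memory than this skb alone is still charged to the
 * socket (earlier packets have not been TX-completed yet). */
static bool model_should_autocork(const struct model_sock *sk,
				  const struct model_skb *skb,
				  unsigned int size_goal)
{
	assert(sk != NULL && skb != NULL);
	return skb->len < size_goal &&
	       sk->autocorking &&
	       skb != sk->write_queue_head &&
	       sk->wmem_alloc > skb->truesize;
}

/* Returns true if the skb should be pushed now, false if it stays corked. */
static bool model_push_now(struct model_sock *sk,
			   const struct model_skb *skb,
			   unsigned int size_goal)
{
	if (model_should_autocork(sk, skb, size_goal)) {
		/* Set the flag first ... */
		sk->tsq_throttled = true;
		/* ... then test again. In the kernel a lockless TX completion
		 * may run between the first test and the flag being set; in
		 * this single-threaded model the second test is redundant and
		 * is kept only to show the ordering. */
		if (sk->wmem_alloc > skb->truesize)
			return false;
	}
	return true;
}
```

With this model, an skb that is the write queue head is always pushed,
whatever the value of wmem_alloc, which is the behaviour the fix asks for.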

erd:~# MIB="MIN_LATENCY,MEAN_LATENCY,MAX_LATENCY,P99_LATENCY,STDDEV_LATENCY"
erd:~# ./netperf -H remote -t TCP_RR -- -o $MIB | tail -n 1
(repeated 3 times)

Before patch (columns in $MIB order, latencies in microseconds):

18,1049.87,41004,39631,6295.47
17,239.52,40804,48,2912.79
18,348.40,40877,54,3573.39

After patch:

18,22.84,4606,38,16.39
17,21.56,2871,36,13.51
17,22.46,2705,37,11.83

Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: f54b311142a9 ("tcp: auto corking")
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/tcp.c