net: ethernet: ti: cpsw: fix net watchdog timeout
authorGrygorii Strashko <grygorii.strashko@ti.com>
Wed, 7 Feb 2018 01:17:06 +0000 (19:17 -0600)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Fri, 9 Mar 2018 06:41:08 +0000 (22:41 -0800)
commitda260080c2e3a80c4b3c70ec6846011137373f2d
treeb17f6b3fcc526a2c67d94ceb9acff3cf16c5d851
parent94870df33c9bf52a9be3ed750670520aad43a44e
net: ethernet: ti: cpsw: fix net watchdog timeout

[ Upstream commit 62f94c2101f35cd45775df00ba09bde77580e26a ]

It was discovered that simple program which indefinitely sends 200b UDP
packets and runs on TI AM574x SoC (SMP) under RT Kernel triggers network
watchdog timeout in TI CPSW driver (<6 hours run). The network watchdog
timeout is triggered due to race between cpsw_ndo_start_xmit() and
cpsw_tx_handler() [NAPI]

cpsw_ndo_start_xmit()
if (unlikely(!cpdma_check_free_tx_desc(txch))) {
txq = netdev_get_tx_queue(ndev, q_idx);
netif_tx_stop_queue(txq);

^^ as per [1] barier has to be used after set_bit() otherwise new value
might not be visible to other cpus
}

cpsw_tx_handler()
if (unlikely(netif_tx_queue_stopped(txq)))
netif_tx_wake_queue(txq);

and when it happens ndev TX queue became disabled forever while driver's HW
TX queue is empty.

Fix this, by adding smp_mb__after_atomic() after netif_tx_stop_queue()
calls and double check for free TX descriptors after stopping ndev TX queue
- if there are free TX descriptors wake up ndev TX queue.

[1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/net/ethernet/ti/cpsw.c