Paolo Abeni says:
====================
udp: reduce cache pressure
In the most common use case, many skb fields are not used by recvmsg(), and
the few that are actually accessed lie on cold cachelines, which leads to
several cache misses per packet.
This patch series attempts to reduce such misses with different strategies:
* caching the interesting fields in the scratch space
* avoiding access to the uninteresting fields altogether
* prefetching
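
As a rough userspace illustration of the first and last strategies (a sketch
only, not the code from this series): the enqueue side copies the few fields
the reader needs into a small scratch area that shares one cacheline, and the
dequeue side prefetches the payload and reads only that compact copy. The
names fake_skb, scratch, scratch_set(), scratch_get() and consume() are
hypothetical, the field layout is an assumption, and __builtin_prefetch()
assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a large packet descriptor: the fields the
 * reader needs are spread out, so they land on different cachelines. */
struct fake_skb {
	uint32_t len;
	char pad1[128];
	uint32_t truesize;
	char pad2[128];
	uint16_t csum_state;	/* assumption: 1 means "checksum already verified" */
	bool is_linear;
	char pad3[128];
	unsigned long scratch[2];	/* assumed unused while the packet is queued */
	const void *data;
};

/* Hypothetical compact copy of the hot fields (8 bytes, one cacheline). */
struct scratch {
	uint32_t truesize;
	uint16_t len;
	bool is_linear;
	bool csum_unnecessary;
};

_Static_assert(sizeof(struct scratch) <= sizeof(((struct fake_skb *)0)->scratch),
	       "scratch copy must fit in the scratch area");

/* Enqueue side: the descriptor is already hot here, so caching is cheap. */
static void scratch_set(struct fake_skb *skb)
{
	struct scratch s;

	assert(skb->len <= UINT16_MAX);	/* a UDP datagram length fits in 16 bits */
	s.truesize = skb->truesize;
	s.len = (uint16_t)skb->len;
	s.is_linear = skb->is_linear;
	s.csum_unnecessary = (skb->csum_state == 1);
	memcpy(skb->scratch, &s, sizeof(s));
}

/* Read the cached copy back without touching the scattered fields. */
static struct scratch scratch_get(const struct fake_skb *skb)
{
	struct scratch s;

	memcpy(&s, skb->scratch, sizeof(s));
	return s;
}

/* Dequeue side: start fetching the payload early, then use only the
 * compact copy to learn the packet length. */
static uint32_t consume(const struct fake_skb *skb)
{
	__builtin_prefetch(skb->data);
	return scratch_get(skb).len;
}
```

A caller would invoke scratch_set() once when queueing the packet and
consume() when the reader dequeues it, so the reader touches the scratch
cacheline instead of each original field.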
Tested using the udp_sink program by Jesper[1] as the receiver, with an h/w l4
rx hash on the ingress nic, so that the number of ingress nic rx queues hit by
the udp traffic could be controlled via ethtool -L.
The udp_sink program was bound to the first idle cpu, to get more
stable numbers.
On a single numa node receiver:
nic rx queues    vanilla      patched kernel    delta
1                1850 kpps    1850 kpps         0%
2                2370 kpps    2700 kpps         13.9%
16               2000 kpps    2220 kpps         11%
[1] https://github.com/netoptimizer/network-testing/blob/master/src/udp_sink.c
v1 -> v2:
- replaced secpath_reset() with skb_release_head_state()
- changed udp_dev_scratch field types to the u{32,16} variants,
  replaced bitfield with bool
v2 -> v3:
- no changes, tested against apachebench for performance regressions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>