From: David S. Miller
Date: Tue, 7 Feb 2017 03:53:14 +0000 (-0500)
Subject: Merge branch 'bridge-improve-cache-utilization'
X-Git-Url: https://git.stricted.de/?a=commitdiff_plain;h=152bff377653047c2a69c226435e2c3fd316b592;p=GitHub%2Fmoto-9609%2Fandroid_kernel_motorola_exynos9610.git

Merge branch 'bridge-improve-cache-utilization'

Nikolay Aleksandrov says:

====================
bridge: improve cache utilization

This is the first set which begins to deal with the bad bridge cache
access patterns. The first patch rearranges the bridge and port structs
a little so the frequently (and closely) accessed members are in the same
cache line. The second patch then moves the garbage collection to a
workqueue trying to improve system responsiveness under load (many fdbs)
and more importantly removes the need to check if the matched entry is
expired in __br_fdb_get which was a major source of false-sharing.
The third patch is a preparation for the final one which
If properly configured, i.e. ports bound to CPUs (thus updating "updated"
locally) then the bridge's HitM goes from 100% to 0%, but even without
binding we get a win because previously every lookup that iterated over
the hash chain caused false-sharing due to the first cache line being
used for both mac/vid and used/updated fields.

Some results from tests I've run:
(note that these were run in good conditions for the baseline, everything
 ran on a single NUMA node and there were only 3 fdbs)

1. baseline
100% Load HitM on the fdbs (between everyone who has done lookups and hit
one of the 3 hash chains of the communicating src/dst fdbs)
Overall 5.06% Load HitM for the bridge, first place in the list

2. patched & ports bound to CPUs
0% Local load HitM, bridge is not even in the c2c report list
Also there's 3% consistent improvement in netperf tests.
====================

Signed-off-by: David S. Miller
---

152bff377653047c2a69c226435e2c3fd316b592
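
The false-sharing problem described in the cover letter (lookup keys such as mac/vid sharing a cache line with the frequently written used/updated fields) can be illustrated with a small layout sketch. This is not the actual bridge fdb structure or the change made by the series. The struct names, field sets, helper functions, and the 64-byte cache line size are all assumptions chosen for illustration. The sketch only shows how moving write-hot fields onto their own cache line keeps writers from invalidating the line that readers walk during a hash chain lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical "before" layout: the lookup key (mac/vid) and the
 * write-hot timestamps (updated/used) sit in the same cache line. */
struct fdb_packed {
	uint8_t       mac[6];
	uint16_t      vid;
	unsigned long updated;
	unsigned long used;
	void         *dst;
};

/* Hypothetical "after" layout: read-mostly fields first, write-hot
 * timestamps pushed to the start of the next cache line. */
struct fdb_split {
	uint8_t       mac[6];
	uint16_t      vid;
	void         *dst;
	_Alignas(CACHE_LINE) unsigned long updated;
	unsigned long used;
};

/* Index of the cache line that holds a member at the given offset,
 * assuming the struct itself starts on a cache line boundary. */
static inline size_t cache_line_of(size_t offset)
{
	return offset / CACHE_LINE;
}

/* Returns 1 if key and timestamps share a line in the packed layout. */
static inline int packed_shares_line(void)
{
	return cache_line_of(offsetof(struct fdb_packed, mac)) ==
	       cache_line_of(offsetof(struct fdb_packed, updated));
}

/* Returns 1 if key and timestamps share a line in the split layout. */
static inline int split_shares_line(void)
{
	return cache_line_of(offsetof(struct fdb_split, mac)) ==
	       cache_line_of(offsetof(struct fdb_split, updated));
}
```

With the packed layout, every CPU that compares mac/vid while walking a chain reads a line that other CPUs dirty when they refresh the timestamps. With the split layout those writes land on a different line, at the cost of a larger entry. Whether that trade-off pays off would have to be measured on the workload in question, for example with a cache-to-cache report like the one mentioned in the results above.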