rcu: Reduce expedited GP memory contention via per-CPU variables
author Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Thu, 1 Oct 2015 17:26:24 +0000 (10:26 -0700)
committer Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Fri, 4 Dec 2015 20:26:52 +0000 (12:26 -0800)
commit df5bd5144a80a9f6c3807383b11f735dae9caf9d
tree fce44d0970a70446c5fd6b50f2e764db0efb8e56
parent 1307f2148719cc9e9d12f5fa7d5b3b61ec5aef72

Currently, the piggybacked-work checks carried out by sync_exp_work_done()
atomically increment a small set of variables (the ->expedited_workdone0,
->expedited_workdone1, ->expedited_workdone2, ->expedited_workdone3
fields in the rcu_state structure), which will form a memory-contention
bottleneck given a sufficiently large number of CPUs concurrently invoking
either synchronize_rcu_expedited() or synchronize_sched_expedited().

This commit therefore moves these four fields to the per-CPU rcu_data
structure, eliminating the memory contention.  The show_rcuexp() function
also changes to sum up each field in the rcu_data structures.
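
The pattern described above can be illustrated with a minimal user-space
sketch. It is not the kernel patch itself: the sketch_ names, the fixed
CPU count, the 64-byte cache-line size, and the use of C11 atomics in
place of the kernel's per-CPU and atomic_long_t machinery are all
assumptions made for illustration. Only one of the four counters is shown.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS   4	/* hypothetical CPU count, illustration only */
#define SKETCH_CACHELINE 64	/* assumed cache-line size */

/*
 * Stand-in for the per-CPU rcu_data structure.  Aligning each element to
 * its own cache line keeps one CPU's increments from bouncing a line
 * that another CPU is also updating.
 */
struct sketch_rcu_data {
	_Alignas(SKETCH_CACHELINE) atomic_long expedited_workdone1;
};

/* Static storage, so every counter starts at zero. */
struct sketch_rcu_data sketch_rcu_data[SKETCH_NR_CPUS];

/* Update side: each CPU touches only its own counter. */
void sketch_count_workdone1(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	atomic_fetch_add_explicit(&sketch_rcu_data[cpu].expedited_workdone1,
				  1, memory_order_relaxed);
}

/* Read side: sum the per-CPU counters, as a statistics dump would. */
long sketch_sum_workdone1(void)
{
	long sum = 0;

	for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += atomic_load_explicit(
			&sketch_rcu_data[cpu].expedited_workdone1,
			memory_order_relaxed);
	return sum;
}
```

The trade-off in this sketch is that updates become cheap and local while
the rarely executed read side pays for a loop over all CPUs. The sum is
only approximate while updates are in flight, which is normally acceptable
for statistics counters.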

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
kernel/rcu/tree.c
kernel/rcu/tree.h
kernel/rcu/tree_trace.c