Yet another exception is where the low real-time latency of RCU's
read-side primitives is critically important.
+ One final exception is where RCU readers are used to prevent
+ the ABA problem (https://en.wikipedia.org/wiki/ABA_problem)
+ for lockless updates. This does result in the mildly
+ counter-intuitive situation where rcu_read_lock() and
+ rcu_read_unlock() are used to protect updates, however, this
+ approach provides the same potential simplifications that garbage
+ collectors do.
+
1. Does the update code have proper mutual exclusion?
RCU does allow -readers- to run (almost) naked, but -writers- must
explain how this single task does not become a major bottleneck on
big multiprocessor machines (for example, if the task is updating
information relating to itself that other tasks can read, there
- by definition can be no bottleneck).
+ by definition can be no bottleneck). Note that the definition
+ of "large" has changed significantly: Eight CPUs was "large"
+ in the year 2000, but a hundred CPUs was unremarkable in 2017.
2. Do the RCU read-side critical sections make proper use of
rcu_read_lock() and friends? These primitives are needed
Disabling of preemption can serve as rcu_read_lock_sched(), but
is less readable.
+ Letting RCU-protected pointers "leak" out of an RCU read-side
+ critical section is every bid as bad as letting them leak out
+ from under a lock. Unless, of course, you have arranged some
+ other means of protection, such as a lock or a reference count
+ -before- letting them out of the RCU read-side critical section.
+
3. Does the update code tolerate concurrent accesses?
The whole point of RCU is to permit readers to run without
This works quite well, also.
- c. Make updates appear atomic to readers. For example,
+ c. Make updates appear atomic to readers. For example,
pointer updates to properly aligned fields will
appear atomic, as will individual atomic primitives.
- Sequences of perations performed under a lock will -not-
+ Sequences of operations performed under a lock will -not-
appear to be atomic to RCU readers, nor will sequences
of multiple atomic primitives.
5. If call_rcu(), or a related primitive such as call_rcu_bh(),
call_rcu_sched(), or call_srcu() is used, the callback function
- must be written to be called from softirq context. In particular,
- it cannot block.
+ will be called from softirq context. In particular, it cannot
+ block.
6. Since synchronize_rcu() can block, it cannot be called from
any sort of irq context. The same rule applies for
synchronize_sched_expedite(), and synchronize_srcu_expedited().
The expedited forms of these primitives have the same semantics
- as the non-expedited forms, but expediting is both expensive
- and unfriendly to real-time workloads. Use of the expedited
- primitives should be restricted to rare configuration-change
- operations that would not normally be undertaken while a real-time
- workload is running.
+ as the non-expedited forms, but expediting is both expensive and
+ (with the exception of synchronize_srcu_expedited()) unfriendly
+ to real-time workloads. Use of the expedited primitives should
+ be restricted to rare configuration-change operations that would
+ not normally be undertaken while a real-time workload is running.
+ However, real-time workloads can use rcupdate.rcu_normal kernel
+ boot parameter to completely disable expedited grace periods,
+ though this might have performance implications.
In particular, if you find yourself invoking one of the expedited
primitives repeatedly in a loop, please do everyone a favor:
of the system, especially to real-time workloads running on
the rest of the system.
- In addition, it is illegal to call the expedited forms from
- a CPU-hotplug notifier, or while holding a lock that is acquired
- by a CPU-hotplug notifier. Failing to observe this restriction
- will result in deadlock.
-
7. If the updater uses call_rcu() or synchronize_rcu(), then the
corresponding readers must use rcu_read_lock() and
rcu_read_unlock(). If the updater uses call_rcu_bh() or
Similarly, disabling preemption is not an acceptable substitute
for rcu_read_lock(). Code that attempts to use preemption
disabling where it should be using rcu_read_lock() will break
- in real-time kernel builds.
+ in CONFIG_PREEMPT=y kernel builds.
If you want to wait for interrupt handlers, NMI handlers, and
code under the influence of preempt_disable(), you instead
not the case, a self-spawning RCU callback would prevent the
victim CPU from ever going offline.)
-14. SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(),
- synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu())
- may only be invoked from process context. Unlike other forms of
- RCU, it -is- permissible to block in an SRCU read-side critical
- section (demarked by srcu_read_lock() and srcu_read_unlock()),
- hence the "SRCU": "sleepable RCU". Please note that if you
- don't need to sleep in read-side critical sections, you should be
- using RCU rather than SRCU, because RCU is almost always faster
- and easier to use than is SRCU.
-
- Also unlike other forms of RCU, explicit initialization
- and cleanup is required via init_srcu_struct() and
- cleanup_srcu_struct(). These are passed a "struct srcu_struct"
- that defines the scope of a given SRCU domain. Once initialized,
- the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
- synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu().
- A given synchronize_srcu() waits only for SRCU read-side critical
+14. Unlike other forms of RCU, it -is- permissible to block in an
+ SRCU read-side critical section (demarked by srcu_read_lock()
+ and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
+ Please note that if you don't need to sleep in read-side critical
+ sections, you should be using RCU rather than SRCU, because RCU
+ is almost always faster and easier to use than is SRCU.
+
+ Also unlike other forms of RCU, explicit initialization and
+ cleanup is required either at build time via DEFINE_SRCU()
+ or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
+ and cleanup_srcu_struct(). These last two are passed a
+ "struct srcu_struct" that defines the scope of a given
+ SRCU domain. Once initialized, the srcu_struct is passed
+ to srcu_read_lock(), srcu_read_unlock() synchronize_srcu(),
+ synchronize_srcu_expedited(), and call_srcu(). A given
+ synchronize_srcu() waits only for SRCU read-side critical
sections governed by srcu_read_lock() and srcu_read_unlock()
calls that have been passed the same srcu_struct. This property
is what makes sleeping read-side critical sections tolerable --
Therefore, SRCU should be used in preference to rw_semaphore
only in extremely read-intensive situations, or in situations
requiring SRCU's read-side deadlock immunity or low read-side
- realtime latency.
+ realtime latency. You should also consider percpu_rw_semaphore
+ when you need lightweight readers.
- Note that, rcu_assign_pointer() relates to SRCU just as it does
- to other forms of RCU.
+ SRCU's expedited primitive (synchronize_srcu_expedited())
+ never sends IPIs to other CPUs, so it is easier on
+ real-time workloads than is synchronize_rcu_expedited(),
+ synchronize_rcu_bh_expedited() or synchronize_sched_expedited().
+
+ Note that rcu_dereference() and rcu_assign_pointer() relate to
+ SRCU just as they do to other forms of RCU.
15. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
These debugging aids can help you find problems that are
otherwise extremely difficult to spot.
+
+18. If you register a callback using call_rcu(), call_rcu_bh(),
+ call_rcu_sched(), or call_srcu(), and pass in a function defined
+ within a loadable module, then it in necessary to wait for
+ all pending callbacks to be invoked after the last invocation
+ and before unloading that module. Note that it is absolutely
+ -not- sufficient to wait for a grace period! The current (say)
+ synchronize_rcu() implementation waits only for all previous
+ callbacks registered on the CPU that synchronize_rcu() is running
+ on, but it is -not- guaranteed to wait for callbacks registered
+ on other CPUs.
+
+ You instead need to use one of the barrier functions:
+
+ o call_rcu() -> rcu_barrier()
+ o call_rcu_bh() -> rcu_barrier_bh()
+ o call_rcu_sched() -> rcu_barrier_sched()
+ o call_srcu() -> srcu_barrier()
+
+ However, these barrier functions are absolutely -not- guaranteed
+ to wait for a grace period. In fact, if there are no call_rcu()
+ callbacks waiting anywhere in the system, rcu_barrier() is within
+ its rights to return immediately.
+
+ So if you need to wait for both an RCU grace period and for
+ all pre-existing call_rcu() callbacks, you will need to execute
+ both rcu_barrier() and synchronize_rcu(), if necessary, using
+ something like workqueues to to execute them concurrently.
+
+ See rcubarrier.txt for more information.