doc: Update RCU documentation

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
author: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2017-06-06 15:04:03 -0700
committer: Paul E. McKenney <paulmck@linux.vnet.ibm.com> 2017-08-17 07:29:48 -0700
commit: 4de5f89ef8498e012ba4755b9b63df28c1382690 (patch)
tree: aa454d59a4b6189ed7f6d064aaf9826fbfe6a344 /Documentation/RCU/checklist.txt
parent: f99bcb2cdb5843548a2b0b78157cb7a0b4c7ccac (diff)
1 files changed, 85 insertions, 36 deletions
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 6beda556faf3..49747717d905 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -23,6 +23,14 @@ over a rather long period of time, but improvements are always welcome!
 	Yet another exception is where the low real-time latency of RCU's
 	read-side primitives is critically important.
 
+	One final exception is where RCU readers are used to prevent
+	the ABA problem (https://en.wikipedia.org/wiki/ABA_problem)
+	for lockless updates.  This does result in the mildly
+	counter-intuitive situation where rcu_read_lock() and
+	rcu_read_unlock() are used to protect updates; however, this
+	approach provides the same potential simplifications that garbage
+	collectors do.
+
 1.	Does the update code have proper mutual exclusion?
 
 	RCU does allow -readers- to run (almost) naked, but -writers- must
@@ -40,7 +48,9 @@ over a rather long period of time, but improvements are always welcome!
 	explain how this single task does not become a major bottleneck on
 	big multiprocessor machines (for example, if the task is updating
 	information relating to itself that other tasks can read, there
-	by definition can be no bottleneck).
+	by definition can be no bottleneck).  Note that the definition
+	of "large" has changed significantly:  Eight CPUs was "large"
+	in the year 2000, but a hundred CPUs was unremarkable in 2017.
 
 2.	Do the RCU read-side critical sections make proper use of
 	rcu_read_lock() and friends?  These primitives are needed
@@ -55,6 +65,12 @@ over a rather long period of time, but improvements are always welcome!
 	Disabling of preemption can serve as rcu_read_lock_sched(), but
 	is less readable.
 
+	Letting RCU-protected pointers "leak" out of an RCU read-side
+	critical section is every bit as bad as letting them leak out
+	from under a lock.  Unless, of course, you have arranged some
+	other means of protection, such as a lock or a reference count
+	-before- letting them out of the RCU read-side critical section.
+
 3.	Does the update code tolerate concurrent accesses?
 
 	The whole point of RCU is to permit readers to run without
@@ -78,10 +94,10 @@ over a rather long period of time, but improvements are always welcome!
 
 		This works quite well, also.
 
-	c.	Make updates appear atomic to readers.  For example,
+	c.	Make updates appear atomic to readers.	For example,
 		pointer updates to properly aligned fields will
 		appear atomic, as will individual atomic primitives.
-		Sequences of perations performed under a lock will -not-
+		Sequences of operations performed under a lock will -not-
 		appear to be atomic to RCU readers, nor will sequences
 		of multiple atomic primitives.
 
@@ -168,8 +184,8 @@ over a rather long period of time, but improvements are always welcome!
 
 5.	If call_rcu(), or a related primitive such as call_rcu_bh(),
 	call_rcu_sched(), or call_srcu() is used, the callback function
-	must be written to be called from softirq context.  In particular,
-	it cannot block.
+	will be called from softirq context.  In particular, it cannot
+	block.
 
 6.	Since synchronize_rcu() can block, it cannot be called from
 	any sort of irq context.  The same rule applies for
@@ -178,11 +194,14 @@ over a rather long period of time, but improvements are always welcome!
 	synchronize_sched_expedite(), and synchronize_srcu_expedited().
 
 	The expedited forms of these primitives have the same semantics
-	as the non-expedited forms, but expediting is both expensive
-	and unfriendly to real-time workloads.	Use of the expedited
-	primitives should be restricted to rare configuration-change
-	operations that would not normally be undertaken while a real-time
-	workload is running.
+	as the non-expedited forms, but expediting is both expensive and
+	(with the exception of synchronize_srcu_expedited()) unfriendly
+	to real-time workloads.  Use of the expedited primitives should
+	be restricted to rare configuration-change operations that would
+	not normally be undertaken while a real-time workload is running.
+	However, real-time workloads can use the rcupdate.rcu_normal
+	kernel boot parameter to completely disable expedited grace
+	periods, though this might have performance implications.
 
 	In particular, if you find yourself invoking one of the expedited
 	primitives repeatedly in a loop, please do everyone a favor:
@@ -193,11 +212,6 @@ over a rather long period of time, but improvements are always welcome!
 	of the system, especially to real-time workloads running on
 	the rest of the system.
 
-	In addition, it is illegal to call the expedited forms from
-	a CPU-hotplug notifier, or while holding a lock that is acquired
-	by a CPU-hotplug notifier.  Failing to observe this restriction
-	will result in deadlock.
-
 7.	If the updater uses call_rcu() or synchronize_rcu(), then the
 	corresponding readers must use rcu_read_lock() and
 	rcu_read_unlock().  If the updater uses call_rcu_bh() or
@@ -321,7 +335,7 @@ over a rather long period of time, but improvements are always welcome!
 	Similarly, disabling preemption is not an acceptable substitute
 	for rcu_read_lock().  Code that attempts to use preemption
 	disabling where it should be using rcu_read_lock() will break
-	in real-time kernel builds.
+	in CONFIG_PREEMPT=y kernel builds.
 
 	If you want to wait for interrupt handlers, NMI handlers, and
 	code under the influence of preempt_disable(), you instead
@@ -356,23 +370,22 @@ over a rather long period of time, but improvements are always welcome!
 	not the case, a self-spawning RCU callback would prevent the
 	victim CPU from ever going offline.)
 
-14.	SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(),
-	synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu())
-	may only be invoked from process context.  Unlike other forms of
-	RCU, it -is- permissible to block in an SRCU read-side critical
-	section (demarked by srcu_read_lock() and srcu_read_unlock()),
-	hence the "SRCU": "sleepable RCU".  Please note that if you
-	don't need to sleep in read-side critical sections, you should be
-	using RCU rather than SRCU, because RCU is almost always faster
-	and easier to use than is SRCU.
-
-	Also unlike other forms of RCU, explicit initialization
-	and cleanup is required via init_srcu_struct() and
-	cleanup_srcu_struct().	These are passed a "struct srcu_struct"
-	that defines the scope of a given SRCU domain.	Once initialized,
-	the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
-	synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu().
-	A given synchronize_srcu() waits only for SRCU read-side critical
+14.	Unlike other forms of RCU, it -is- permissible to block in an
+	SRCU read-side critical section (demarked by srcu_read_lock()
+	and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
+	Please note that if you don't need to sleep in read-side critical
+	sections, you should be using RCU rather than SRCU, because RCU
+	is almost always faster and easier to use than is SRCU.
+
+	Also unlike other forms of RCU, explicit initialization and
+	cleanup are required either at build time via DEFINE_SRCU()
+	or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
+	and cleanup_srcu_struct().  These last two are passed a
+	"struct srcu_struct" that defines the scope of a given
+	SRCU domain.  Once initialized, the srcu_struct is passed
+	to srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(),
+	synchronize_srcu_expedited(), and call_srcu().	A given
+	synchronize_srcu() waits only for SRCU read-side critical
 	sections governed by srcu_read_lock() and srcu_read_unlock()
 	calls that have been passed the same srcu_struct.  This property
 	is what makes sleeping read-side critical sections tolerable --
@@ -390,10 +403,16 @@ over a rather long period of time, but improvements are always welcome!
 	Therefore, SRCU should be used in preference to rw_semaphore
 	only in extremely read-intensive situations, or in situations
 	requiring SRCU's read-side deadlock immunity or low read-side
-	realtime latency.
+	realtime latency.  You should also consider percpu_rw_semaphore
+	when you need lightweight readers.
 
-	Note that, rcu_assign_pointer() relates to SRCU just as it does
-	to other forms of RCU.
+	SRCU's expedited primitive (synchronize_srcu_expedited())
+	never sends IPIs to other CPUs, so it is easier on
+	real-time workloads than is synchronize_rcu_expedited(),
+	synchronize_rcu_bh_expedited() or synchronize_sched_expedited().
+
+	Note that rcu_dereference() and rcu_assign_pointer() relate to
+	SRCU just as they do to other forms of RCU.
 
 15.	The whole point of call_rcu(), synchronize_rcu(), and friends
 	is to wait until all pre-existing readers have finished before
@@ -435,3 +454,33 @@ over a rather long period of time, but improvements are always welcome!
 
 	These debugging aids can help you find problems that are
 	otherwise extremely difficult to spot.
+
+18.	If you register a callback using call_rcu(), call_rcu_bh(),
+	call_rcu_sched(), or call_srcu(), and pass in a function defined
+	within a loadable module, then it is necessary to wait for
+	all pending callbacks to be invoked after the last invocation
+	and before unloading that module.  Note that it is absolutely
+	-not- sufficient to wait for a grace period!  The current (say)
+	synchronize_rcu() implementation waits only for all previous
+	callbacks registered on the CPU that synchronize_rcu() is running
+	on, but it is -not- guaranteed to wait for callbacks registered
+	on other CPUs.
+
+	You instead need to use one of the barrier functions:
+
+	o	call_rcu() -> rcu_barrier()
+	o	call_rcu_bh() -> rcu_barrier_bh()
+	o	call_rcu_sched() -> rcu_barrier_sched()
+	o	call_srcu() -> srcu_barrier()
+
+	However, these barrier functions are absolutely -not- guaranteed
+	to wait for a grace period.  In fact, if there are no call_rcu()
+	callbacks waiting anywhere in the system, rcu_barrier() is within
+	its rights to return immediately.
+
+	So if you need to wait for both an RCU grace period and for
+	all pre-existing call_rcu() callbacks, you will need to execute
+	both rcu_barrier() and synchronize_rcu(), if necessary, using
+	something like workqueues to execute them concurrently.
+
+	See rcubarrier.txt for more information.