summaryrefslogtreecommitdiff
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2010-10-21 12:54:12 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2010-10-21 12:54:12 -0700
commit888a6f77e0418b049f83d37547c209b904d30af4 (patch)
tree42cdb9f781d2177e6b380e69a66a27ec7705f51f /Documentation
parent31b7eab27a314b153d8fa07ba9e9ec00a98141e1 (diff)
parent6506cf6ce68d78a5470a8360c965dafe8e4b78e3 (diff)
downloadlwn-888a6f77e0418b049f83d37547c209b904d30af4.tar.gz
lwn-888a6f77e0418b049f83d37547c209b904d30af4.zip
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (52 commits) sched: fix RCU lockdep splat from task_group() rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value sched: suppress RCU lockdep splat in task_fork_fair net: suppress RCU lockdep false positive in sock_update_classid rcu: move check from rcu_dereference_bh to rcu_read_lock_bh_held rcu: Add advice to PROVE_RCU_REPEATEDLY kernel config parameter rcu: Add tracing data to support queueing models rcu: fix sparse errors in rcutorture.c rcu: only one evaluation of arg in rcu_dereference_check() unless sparse kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC rcu: fix _oddness handling of verbose stall warnings rcu: performance fixes to TINY_PREEMPT_RCU callback checking rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes vhost: add __rcu annotations rcu: add comment stating that list_empty() applies to RCU-protected lists rcu: apply TINY_PREEMPT_RCU read-side speedup to TREE_PREEMPT_RCU rcu: combine duplicate code, courtesy of CONFIG_PREEMPT_RCU rcu: Upgrade srcu_read_lock() docbook about SRCU grace periods rcu: document ways of stalling updates in low-memory situations rcu: repair code-duplication FIXMEs ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/DocBook/kernel-locking.tmpl14
-rw-r--r--Documentation/RCU/checklist.txt46
-rw-r--r--Documentation/RCU/stallwarn.txt18
-rw-r--r--Documentation/RCU/trace.txt13
4 files changed, 73 insertions, 18 deletions
diff --git a/Documentation/DocBook/kernel-locking.tmpl b/Documentation/DocBook/kernel-locking.tmpl
index a0d479d1e1dd..f66f4df18690 100644
--- a/Documentation/DocBook/kernel-locking.tmpl
+++ b/Documentation/DocBook/kernel-locking.tmpl
@@ -1645,7 +1645,9 @@ the amount of locking which needs to be done.
all the readers who were traversing the list when we deleted the
element are finished. We use <function>call_rcu()</function> to
register a callback which will actually destroy the object once
- the readers are finished.
+ all pre-existing readers are finished. Alternatively,
+ <function>synchronize_rcu()</function> may be used to block until
+ all pre-existing are finished.
</para>
<para>
But how does Read Copy Update know when the readers are
@@ -1714,7 +1716,7 @@ the amount of locking which needs to be done.
- object_put(obj);
+ list_del_rcu(&amp;obj-&gt;list);
cache_num--;
-+ call_rcu(&amp;obj-&gt;rcu, cache_delete_rcu, obj);
++ call_rcu(&amp;obj-&gt;rcu, cache_delete_rcu);
}
/* Must be holding cache_lock */
@@ -1725,14 +1727,6 @@ the amount of locking which needs to be done.
if (++cache_num > MAX_CACHE_SIZE) {
struct object *i, *outcast = NULL;
list_for_each_entry(i, &amp;cache, list) {
-@@ -85,6 +94,7 @@
- obj-&gt;popularity = 0;
- atomic_set(&amp;obj-&gt;refcnt, 1); /* The cache holds a reference */
- spin_lock_init(&amp;obj-&gt;lock);
-+ INIT_RCU_HEAD(&amp;obj-&gt;rcu);
-
- spin_lock_irqsave(&amp;cache_lock, flags);
- __cache_add(obj);
@@ -104,12 +114,11 @@
struct object *cache_find(int id)
{
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 790d1a812376..0c134f8afc6f 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -218,13 +218,22 @@ over a rather long period of time, but improvements are always welcome!
include:
a. Keeping a count of the number of data-structure elements
- used by the RCU-protected data structure, including those
- waiting for a grace period to elapse. Enforce a limit
- on this number, stalling updates as needed to allow
- previously deferred frees to complete.
-
- Alternatively, limit only the number awaiting deferred
- free rather than the total number of elements.
+ used by the RCU-protected data structure, including
+ those waiting for a grace period to elapse. Enforce a
+ limit on this number, stalling updates as needed to allow
+ previously deferred frees to complete. Alternatively,
+ limit only the number awaiting deferred free rather than
+ the total number of elements.
+
+ One way to stall the updates is to acquire the update-side
+ mutex. (Don't try this with a spinlock -- other CPUs
+ spinning on the lock could prevent the grace period
+ from ever ending.) Another way to stall the updates
+ is for the updates to use a wrapper function around
+ the memory allocator, so that this wrapper function
+ simulates OOM when there is too much memory awaiting an
+ RCU grace period. There are of course many other
+ variations on this theme.
b. Limiting update rate. For example, if updates occur only
once per hour, then no explicit rate limiting is required,
@@ -365,3 +374,26 @@ over a rather long period of time, but improvements are always welcome!
and the compiler to freely reorder code into and out of RCU
read-side critical sections. It is the responsibility of the
RCU update-side primitives to deal with this.
+
+17. Use CONFIG_PROVE_RCU, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and
+ the __rcu sparse checks to validate your RCU code. These
+ can help find problems as follows:
+
+ CONFIG_PROVE_RCU: check that accesses to RCU-protected data
+ structures are carried out under the proper RCU
+ read-side critical section, while holding the right
+ combination of locks, or whatever other conditions
+ are appropriate.
+
+ CONFIG_DEBUG_OBJECTS_RCU_HEAD: check that you don't pass the
+ same object to call_rcu() (or friends) before an RCU
+ grace period has elapsed since the last time that you
+ passed that same object to call_rcu() (or friends).
+
+ __rcu sparse checks: tag the pointer to the RCU-protected data
+ structure with __rcu, and sparse will warn you if you
+ access that pointer without the services of one of the
+ variants of rcu_dereference().
+
+ These debugging aids can help you find problems that are
+ otherwise extremely difficult to spot.
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 44c6dcc93d6d..862c08ef1fde 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -80,6 +80,24 @@ o A CPU looping with bottom halves disabled. This condition can
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule().
+o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
+ happen to preempt a low-priority task in the middle of an RCU
+ read-side critical section. This is especially damaging if
+ that low-priority task is not permitted to run on any other CPU,
+ in which case the next RCU grace period can never complete, which
+ will eventually cause the system to run out of memory and hang.
+ While the system is in the process of running itself out of
+ memory, you might see stall-warning messages.
+
+o A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
+ is running at a higher priority than the RCU softirq threads.
+ This will prevent RCU callbacks from ever being invoked,
+ and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
+ RCU grace periods from ever completing. Either way, the
+ system will eventually run out of memory and hang. In the
+ CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
+ messages.
+
o A bug in the RCU implementation.
o A hardware failure. This is quite unlikely, but has occurred
diff --git a/Documentation/RCU/trace.txt b/Documentation/RCU/trace.txt
index efd8cc95c06b..a851118775d8 100644
--- a/Documentation/RCU/trace.txt
+++ b/Documentation/RCU/trace.txt
@@ -125,6 +125,17 @@ o "b" is the batch limit for this CPU. If more than this number
of RCU callbacks is ready to invoke, then the remainder will
be deferred.
+o "ci" is the number of RCU callbacks that have been invoked for
+ this CPU. Note that ci+ql is the number of callbacks that have
+ been registered in absence of CPU-hotplug activity.
+
+o "co" is the number of RCU callbacks that have been orphaned due to
+ this CPU going offline.
+
+o "ca" is the number of RCU callbacks that have been adopted due to
+ other CPUs going offline. Note that ci+co-ca+ql is the number of
+ RCU callbacks registered on this CPU.
+
There is also an rcu/rcudata.csv file with the same information in
comma-separated-variable spreadsheet format.
@@ -180,7 +191,7 @@ o "s" is the "signaled" state that drives force_quiescent_state()'s
o "jfq" is the number of jiffies remaining for this grace period
before force_quiescent_state() is invoked to help push things
- along. Note that CPUs in dyntick-idle mode thoughout the grace
+ along. Note that CPUs in dyntick-idle mode throughout the grace
period will not report on their own, but rather must be check by
some other CPU via force_quiescent_state().