summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorOleg Nesterov <oleg@tv-sign.ru>2007-07-17 04:03:55 -0700
committerGreg Kroah-Hartman <gregkh@suse.de>2007-08-09 14:27:41 -0700
commit0b9a58a713f276833943528792844808ccc3e4ae (patch)
tree6974d9ff711a39bd92941fc231d889d1239b1231
parent7553b617208a627281cd764ec6b08070e56a4dcb (diff)
downloadlwn-0b9a58a713f276833943528792844808ccc3e4ae.tar.gz
lwn-0b9a58a713f276833943528792844808ccc3e4ae.zip
destroy_workqueue() can livelock
Pointed out by Michal Schmidt <mschmidt@redhat.com>. The bug was introduced in 2.6.22 by me. cleanup_workqueue_thread() does flush_cpu_workqueue(cwq) in a loop until ->worklist becomes empty. This is live-lockable, a re-niced caller can get CPU after wake_up() and insert a new barrier before the lower-priority cwq->thread has a chance to clear ->current_work. Change cleanup_workqueue_thread() to do flush_cpu_workqueue(cwq) only once. We can rely on the fact that run_workqueue() won't return until it flushes all works. So it is safe to call kthread_stop() after that, the "should stop" request won't be noticed until run_workqueue() returns. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Michal Schmidt <mschmidt@redhat.com> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-rw-r--r--kernel/workqueue.c11
1 files changed, 5 insertions, 6 deletions
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3bebf73be976..3831f8801be6 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -739,18 +739,17 @@ static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
if (cwq->thread == NULL)
return;
+ flush_cpu_workqueue(cwq);
/*
- * If the caller is CPU_DEAD the single flush_cpu_workqueue()
- * is not enough, a concurrent flush_workqueue() can insert a
- * barrier after us.
+ * If the caller is CPU_DEAD and cwq->worklist was not empty,
+ * a concurrent flush_workqueue() can insert a barrier after us.
+ * However, in that case run_workqueue() won't return and check
+ * kthread_should_stop() until it flushes all work_struct's.
* When ->worklist becomes empty it is safe to exit because no
* more work_structs can be queued on this cwq: flush_workqueue
* checks list_empty(), and a "normal" queue_work() can't use
* a dead CPU.
*/
- while (flush_cpu_workqueue(cwq))
- ;
-
kthread_stop(cwq->thread);
cwq->thread = NULL;
}