diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2011-04-30 09:15:40 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2011-04-30 09:15:40 -0700 |
commit | 3fd9952df4964fac7d5868ba48eadcc9dae3ba46 (patch) | |
tree | ac5ad53b758329fbd5d38972a26fb8380092a649 | |
parent | 1be6a1f89f131e9c3d22f819ec542be9cda8c9e3 (diff) | |
parent | 5035b20fa5cd146b66f5f89619c20a4177fb736d (diff) | |
download | lwn-3fd9952df4964fac7d5868ba48eadcc9dae3ba46.tar.gz lwn-3fd9952df4964fac7d5868ba48eadcc9dae3ba46.zip |
Merge branch 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'fixes-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix deadlock in worker_maybe_bind_and_lock()
workqueue: Document debugging tricks
Fix up trivial spelling conflict in kernel/workqueue.c
-rw-r--r-- | Documentation/workqueue.txt | 40 | ||||
-rw-r--r-- | kernel/workqueue.c | 8 |
2 files changed, 47 insertions, 1 deletions
diff --git a/Documentation/workqueue.txt b/Documentation/workqueue.txt index 01c513fac40e..a0b577de918f 100644 --- a/Documentation/workqueue.txt +++ b/Documentation/workqueue.txt @@ -12,6 +12,7 @@ CONTENTS 4. Application Programming Interface (API) 5. Example Execution Scenarios 6. Guidelines +7. Debugging 1. Introduction @@ -379,3 +380,42 @@ If q1 has WQ_CPU_INTENSIVE set, * Unless work items are expected to consume a huge amount of CPU cycles, using a bound wq is usually beneficial due to the increased level of locality in wq operations and work item execution. + + +7. Debugging + +Because the work functions are executed by generic worker threads +there are a few tricks needed to shed some light on misbehaving +workqueue users. + +Worker threads show up in the process list as: + +root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1] +root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2] +root 5673 0.0 0.0 0 0 ? S 12:12 0:00 [kworker/0:0] +root 5674 0.0 0.0 0 0 ? S 12:13 0:00 [kworker/1:0] + +If kworkers are going crazy (using too much cpu), there are two types +of possible problems: + + 1. Something beeing scheduled in rapid succession + 2. A single work item that consumes lots of cpu cycles + +The first one can be tracked using tracing: + + $ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace_pipe > out.txt + (wait a few secs) + ^C + +If something is busy looping on work queueing, it would be dominating +the output and the offender can be determined with the work item +function. + +For the second type of problems it should be possible to just check +the stack trace of the offending worker thread. + + $ cat /proc/THE_OFFENDING_KWORKER/stack + +The work item's function should be trivially visible in the stack +trace. diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 8859a41806dd..e3378e8d3a5c 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1291,8 +1291,14 @@ __acquires(&gcwq->lock) return true; spin_unlock_irq(&gcwq->lock); - /* CPU has come up in between, retry migration */ + /* + * We've raced with CPU hot[un]plug. Give it a breather + * and retry migration. cond_resched() is required here; + * otherwise, we might deadlock against cpu_stop trying to + * bring down the CPU on non-preemptive kernel. + */ cpu_relax(); + cond_resched(); } } |