diff options
Diffstat (limited to 'Documentation/trace')
-rw-r--r-- | Documentation/trace/events-msr.rst | 4 | ||||
-rw-r--r-- | Documentation/trace/events-nmi.rst | 6 | ||||
-rw-r--r-- | Documentation/trace/events.rst | 90 | ||||
-rw-r--r-- | Documentation/trace/ftrace.rst | 6 | ||||
-rw-r--r-- | Documentation/trace/histogram-design.rst | 12 | ||||
-rw-r--r-- | Documentation/trace/histogram.rst | 416 | ||||
-rw-r--r-- | Documentation/trace/kprobetrace.rst | 53 | ||||
-rw-r--r-- | Documentation/trace/mmiotrace.rst | 20 | ||||
-rw-r--r-- | Documentation/trace/postprocess/trace-pagealloc-postprocess.pl | 4 | ||||
-rw-r--r-- | Documentation/trace/postprocess/trace-vmscan-postprocess.pl | 4 | ||||
-rw-r--r-- | Documentation/trace/tracepoint-analysis.rst | 8 | ||||
-rw-r--r-- | Documentation/trace/uprobetracer.rst | 22 | ||||
-rw-r--r-- | Documentation/trace/user_events.rst | 18 |
13 files changed, 417 insertions, 246 deletions
diff --git a/Documentation/trace/events-msr.rst b/Documentation/trace/events-msr.rst index 810481e530b6..35d06dc66bc2 100644 --- a/Documentation/trace/events-msr.rst +++ b/Documentation/trace/events-msr.rst @@ -8,7 +8,7 @@ at https://www.intel.com/sdm (Volume 3) Available trace points: -/sys/kernel/debug/tracing/events/msr/ +/sys/kernel/tracing/events/msr/ Trace MSR reads: @@ -34,7 +34,7 @@ rdpmc The trace data can be post processed with the postprocess/decode_msr.py script:: - cat /sys/kernel/debug/tracing/trace | decode_msr.py /usr/src/linux/include/asm/msr-index.h + cat /sys/kernel/tracing/trace | decode_msr.py /usr/src/linux/include/asm/msr-index.h to add symbolic MSR names. diff --git a/Documentation/trace/events-nmi.rst b/Documentation/trace/events-nmi.rst index 9e0a7289d80a..22ac1be0ea6f 100644 --- a/Documentation/trace/events-nmi.rst +++ b/Documentation/trace/events-nmi.rst @@ -4,7 +4,7 @@ NMI Trace Events These events normally show up here: - /sys/kernel/debug/tracing/events/nmi + /sys/kernel/tracing/events/nmi nmi_handler @@ -31,13 +31,13 @@ really hogging a lot of CPU time, like a millisecond at a time. Note that the kernel's output is in milliseconds, but the input to the filter is in nanoseconds! You can filter on 'delta_ns':: - cd /sys/kernel/debug/tracing/events/nmi/nmi_handler + cd /sys/kernel/tracing/events/nmi/nmi_handler echo 'handler==0xffffffff81625600 && delta_ns>1000000' > filter echo 1 > enable Your output would then look like:: - $ cat /sys/kernel/debug/tracing/trace_pipe + $ cat /sys/kernel/tracing/trace_pipe <idle>-0 [000] d.h3 505.397558: nmi_handler: perf_event_nmi_handler() delta_ns: 3236765 handled: 1 <idle>-0 [000] d.h3 505.805893: nmi_handler: perf_event_nmi_handler() delta_ns: 3174234 handled: 1 <idle>-0 [000] d.h3 506.158206: nmi_handler: perf_event_nmi_handler() delta_ns: 3084642 handled: 1 diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst index c47f381d0c00..f5fcb8e1218f 100644 --- a/Documentation/trace/events.rst +++ b/Documentation/trace/events.rst @@ -24,27 +24,27 @@ tracing information should be printed. --------------------------------- The events which are available for tracing can be found in the file -/sys/kernel/debug/tracing/available_events. +/sys/kernel/tracing/available_events. To enable a particular event, such as 'sched_wakeup', simply echo it -to /sys/kernel/debug/tracing/set_event. For example:: +to /sys/kernel/tracing/set_event. For example:: - # echo sched_wakeup >> /sys/kernel/debug/tracing/set_event + # echo sched_wakeup >> /sys/kernel/tracing/set_event .. Note:: '>>' is necessary, otherwise it will firstly disable all the events. To disable an event, echo the event name to the set_event file prefixed with an exclamation point:: - # echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event + # echo '!sched_wakeup' >> /sys/kernel/tracing/set_event To disable all events, echo an empty line to the set_event file:: - # echo > /sys/kernel/debug/tracing/set_event + # echo > /sys/kernel/tracing/set_event To enable all events, echo ``*:*`` or ``*:`` to the set_event file:: - # echo *:* > /sys/kernel/debug/tracing/set_event + # echo *:* > /sys/kernel/tracing/set_event The events are organized into subsystems, such as ext4, irq, sched, etc., and a full event name looks like this: <subsystem>:<event>. The @@ -53,29 +53,29 @@ file. All of the events in a subsystem can be specified via the syntax ``<subsystem>:*``; for example, to enable all irq events, you can use the command:: - # echo 'irq:*' > /sys/kernel/debug/tracing/set_event + # echo 'irq:*' > /sys/kernel/tracing/set_event 2.2 Via the 'enable' toggle --------------------------- -The events available are also listed in /sys/kernel/debug/tracing/events/ hierarchy +The events available are also listed in /sys/kernel/tracing/events/ hierarchy of directories. To enable event 'sched_wakeup':: - # echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable + # echo 1 > /sys/kernel/tracing/events/sched/sched_wakeup/enable To disable it:: - # echo 0 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable + # echo 0 > /sys/kernel/tracing/events/sched/sched_wakeup/enable To enable all events in sched subsystem:: - # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + # echo 1 > /sys/kernel/tracing/events/sched/enable To enable all events:: - # echo 1 > /sys/kernel/debug/tracing/events/enable + # echo 1 > /sys/kernel/tracing/events/enable When reading one of these enable files, there are four results: @@ -126,7 +126,7 @@ is the size of the data item, in bytes. For example, here's the information displayed for the 'sched_wakeup' event:: - # cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/format + # cat /sys/kernel/tracing/events/sched/sched_wakeup/format name: sched_wakeup ID: 60 @@ -207,6 +207,18 @@ field name:: As the kernel will have to know how to retrieve the memory that the pointer is at from user space. +You can convert any long type to a function address and search by function name:: + + call_site.function == security_prepare_creds + +The above will filter when the field "call_site" falls on the address within +"security_prepare_creds". That is, it will compare the value of "call_site" and +the filter will return true if it is greater than or equal to the start of +the function "security_prepare_creds" and less than the end of that function. + +The ".function" postfix can only be attached to values of size long, and can only +be compared with "==" or "!=". + 5.2 Setting filters ------------------- @@ -215,19 +227,19 @@ to the 'filter' file for the given event. For example:: - # cd /sys/kernel/debug/tracing/events/sched/sched_wakeup + # cd /sys/kernel/tracing/events/sched/sched_wakeup # echo "common_preempt_count > 4" > filter A slightly more involved example:: - # cd /sys/kernel/debug/tracing/events/signal/signal_generate + # cd /sys/kernel/tracing/events/signal/signal_generate # echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter If there is an error in the expression, you'll get an 'Invalid argument' error when setting it, and the erroneous string along with an error message can be seen by looking at the filter e.g.:: - # cd /sys/kernel/debug/tracing/events/signal/signal_generate + # cd /sys/kernel/tracing/events/signal/signal_generate # echo "((sig >= 10 && sig < 15) || dsig == 17) && comm != bash" > filter -bash: echo: write error: Invalid argument # cat filter @@ -258,7 +270,7 @@ file. To clear the filters for all events in a subsystem, write a '0' to the subsystem's filter file. -5.3 Subsystem filters +5.4 Subsystem filters --------------------- For convenience, filters for every event in a subsystem can be set or @@ -277,7 +289,7 @@ above points: Clear the filters on all events in the sched subsystem:: - # cd /sys/kernel/debug/tracing/events/sched + # cd /sys/kernel/tracing/events/sched # echo 0 > filter # cat sched_switch/filter none @@ -287,7 +299,7 @@ Clear the filters on all events in the sched subsystem:: Set a filter using only common fields for all events in the sched subsystem (all events end up with the same filter):: - # cd /sys/kernel/debug/tracing/events/sched + # cd /sys/kernel/tracing/events/sched # echo common_pid == 0 > filter # cat sched_switch/filter common_pid == 0 @@ -298,14 +310,14 @@ Attempt to set a filter using a non-common field for all events in the sched subsystem (all events but those that have a prev_pid field retain their old filters):: - # cd /sys/kernel/debug/tracing/events/sched + # cd /sys/kernel/tracing/events/sched # echo prev_pid == 0 > filter # cat sched_switch/filter prev_pid == 0 # cat sched_wakeup/filter common_pid == 0 -5.4 PID filtering +5.5 PID filtering ----------------- The set_event_pid file in the same directory as the top events directory @@ -313,7 +325,7 @@ exists, will filter all events from tracing any task that does not have the PID listed in the set_event_pid file. :: - # cd /sys/kernel/debug/tracing + # cd /sys/kernel/tracing # echo $$ > set_event_pid # echo 1 > events/enable @@ -409,14 +421,14 @@ The following commands are supported: specifies that this enablement happens only once:: # echo 'enable_event:kmem:kmalloc:1' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger + /sys/kernel/tracing/events/syscalls/sys_enter_read/trigger The following trigger causes kmalloc events to stop being traced when a read system call exits. This disablement happens on every read system call exit:: # echo 'disable_event:kmem:kmalloc' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger + /sys/kernel/tracing/events/syscalls/sys_exit_read/trigger The format is:: @@ -426,10 +438,10 @@ The following commands are supported: To remove the above commands:: # echo '!enable_event:kmem:kmalloc:1' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger + /sys/kernel/tracing/events/syscalls/sys_enter_read/trigger # echo '!disable_event:kmem:kmalloc' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger + /sys/kernel/tracing/events/syscalls/sys_exit_read/trigger Note that there can be any number of enable/disable_event triggers per triggering event, but there can only be one trigger per @@ -448,13 +460,13 @@ The following commands are supported: kmalloc tracepoint is hit:: # echo 'stacktrace' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger The following trigger dumps a stacktrace the first 5 times a kmalloc request happens with a size >= 64K:: # echo 'stacktrace:5 if bytes_req >= 65536' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger The format is:: @@ -463,16 +475,16 @@ The following commands are supported: To remove the above commands:: # echo '!stacktrace' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger # echo '!stacktrace:5 if bytes_req >= 65536' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger The latter can also be removed more simply by the following (without the filter):: # echo '!stacktrace:5' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger Note that there can be only one stacktrace trigger per triggering event. @@ -488,20 +500,20 @@ The following commands are supported: capture those events when the trigger event occurred:: # echo 'snapshot if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger To only snapshot once:: # echo 'snapshot:1 if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger To remove the above commands:: # echo '!snapshot if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger # echo '!snapshot:1 if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger Note that there can be only one snapshot trigger per triggering event. @@ -519,20 +531,20 @@ The following commands are supported: trigger event:: # echo 'traceoff:1 if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger To always disable tracing when nr_rq > 1:: # echo 'traceoff if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger To remove the above commands:: # echo '!traceoff:1 if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger # echo '!traceoff if nr_rq > 1' > \ - /sys/kernel/debug/tracing/events/block/block_unplug/trigger + /sys/kernel/tracing/events/block/block_unplug/trigger Note that there can be only one traceon or traceoff trigger per triggering event. diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst index 21f01d32c959..b927fb2b94dc 100644 --- a/Documentation/trace/ftrace.rst +++ b/Documentation/trace/ftrace.rst @@ -830,10 +830,10 @@ Error conditions The extended error information and usage takes the form shown in this example:: - # echo xxx > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger + # echo xxx > /sys/kernel/tracing/events/sched/sched_wakeup/trigger echo: write error: Invalid argument - # cat /sys/kernel/debug/tracing/error_log + # cat /sys/kernel/tracing/error_log [ 5348.887237] location: error: Couldn't yyy: zzz Command: xxx ^ @@ -843,7 +843,7 @@ Error conditions To clear the error log, echo the empty string into it:: - # echo > /sys/kernel/debug/tracing/error_log + # echo > /sys/kernel/tracing/error_log Examples of using the tracer ---------------------------- diff --git a/Documentation/trace/histogram-design.rst b/Documentation/trace/histogram-design.rst index 088c8cce738b..5765eb3e9efa 100644 --- a/Documentation/trace/histogram-design.rst +++ b/Documentation/trace/histogram-design.rst @@ -14,7 +14,7 @@ tracing_map.c. Note: All the ftrace histogram command examples assume the working directory is the ftrace /tracing directory. For example:: - # cd /sys/kernel/debug/tracing + # cd /sys/kernel/tracing Also, the histogram output displayed for those commands will be generally be truncated - only enough to make the point is displayed. @@ -905,7 +905,7 @@ means it will be automatically converted into a field variable:: # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid)' >> - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger The diagram for the sched_switch event is similar to previous examples but shows the additional field_vars[] array for hist_data and shows @@ -1112,7 +1112,7 @@ sched_switch event fields, next_pid and next_comm, to generate a wakeup_latency trace event. The next_pid and next_comm event fields are automatically converted into field variables for this purpose:: - # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/tracing/events/sched/sched_switch/trigger The sched_waking hist_debug output shows the same data as in the previous test example:: @@ -1305,7 +1305,7 @@ and event name for the onmatch() handler:: The commands below can be used to clean things up for the next test:: - # echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + # echo '!hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,next_comm)' >> /sys/kernel/tracing/events/sched/sched_switch/trigger # echo '!hist:keys=pid:ts0=common_timestamp.usecs' >> events/sched/sched_waking/trigger @@ -1363,13 +1363,13 @@ with the save() and snapshot() actions. For example:: # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm)' >> - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger or:: # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ onmax($wakeup_lat).snapshot()' >> - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger save() action field variable test --------------------------------- diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index f95459aa984f..479c9eac6335 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -81,6 +81,7 @@ Documentation written by Tom Zanussi .usecs display a common_timestamp in microseconds .percent display a number of percentage value .graph display a bar-graph of a value + .stacktrace display as a stacktrace (must by a long[] type) ============= ================================================= Note that in general the semantics of a given field aren't @@ -102,12 +103,12 @@ Documentation written by Tom Zanussi trigger, read its current contents, and then turn it off:: # echo 'hist:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger - # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist + # cat /sys/kernel/tracing/events/net/netif_rx/hist # echo '!hist:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger The trigger file itself can be read to show the details of the currently attached hist trigger. This information is also displayed @@ -169,13 +170,13 @@ Documentation written by Tom Zanussi aggregation on and off when conditions of interest are hit:: # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + /sys/kernel/tracing/events/sched/sched_process_exec/trigger # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + /sys/kernel/tracing/events/sched/sched_process_exit/trigger The above sets up an initially paused hist trigger which is unpaused and starts aggregating events when a given program is executed, and @@ -218,7 +219,7 @@ Extended error information event. The fields that can be used for the hist trigger are listed in the kmalloc event's format file:: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format + # cat /sys/kernel/tracing/events/kmem/kmalloc/format name: kmalloc ID: 374 format: @@ -238,7 +239,7 @@ Extended error information the kernel that made one or more calls to kmalloc:: # echo 'hist:key=call_site:val=bytes_req.buckets=32' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger This tells the tracing system to create a 'hist' trigger using the call_site field of the kmalloc event as the key for the table, which @@ -252,7 +253,7 @@ Extended error information file in the kmalloc event's subdirectory (for readability, a number of entries have been omitted):: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] { call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176 @@ -292,7 +293,7 @@ Extended error information the trigger info, which can also be displayed by reading the 'trigger' file:: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + # cat /sys/kernel/tracing/events/kmem/kmalloc/trigger hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] At the end of the output are a few lines that display the overall @@ -323,7 +324,7 @@ Extended error information command history and re-execute it with a '!' prepended:: # echo '!hist:key=call_site:val=bytes_req' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger Finally, notice that the call_site as displayed in the output above isn't really very useful. It's an address, but normally addresses @@ -331,9 +332,9 @@ Extended error information value, simply append '.hex' to the field name in the trigger:: # echo 'hist:key=call_site.hex:val=bytes_req' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active] { call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433 @@ -376,9 +377,9 @@ Extended error information trigger:: # echo 'hist:key=call_site.sym:val=bytes_req' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active] { call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024 @@ -426,9 +427,9 @@ Extended error information the 'sort' parameter, along with the 'descending' modifier:: # echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464 @@ -467,9 +468,9 @@ Extended error information name, just use 'sym-offset' instead:: # echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] { call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720 @@ -506,9 +507,9 @@ Extended error information allocated in a descending order:: # echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active] { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016 @@ -549,7 +550,7 @@ Extended error information value 'stacktrace' for the key parameter:: # echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger The above trigger will use the kernel stack trace in effect when an event is triggered as the key for the hash table. This allows the @@ -559,7 +560,7 @@ Extended error information every callpath in the system that led up to a kmalloc (in this case every callpath to a kmalloc for a kernel compile):: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active] { stacktrace: @@ -658,9 +659,9 @@ Extended error information keeps a per-process sum of total bytes read:: # echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger + /sys/kernel/tracing/events/syscalls/sys_enter_read/trigger - # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist + # cat /sys/kernel/tracing/events/syscalls/sys_enter_read/hist # trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active] { common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512 @@ -699,9 +700,9 @@ Extended error information counts for the system during the run:: # echo 'hist:key=id.syscall:val=hitcount' > \ - /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger + /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger - # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist + # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist # trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active] { id: sys_fsync [ 74] } hitcount: 1 @@ -753,9 +754,9 @@ Extended error information hitcount sum as the secondary key:: # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \ - /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger + /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger - # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist + # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active] { id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1 @@ -803,9 +804,9 @@ Extended error information can use that to filter out all the other syscalls:: # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \ - /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger + /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger - # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist + # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active] { id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1 @@ -846,9 +847,9 @@ Extended error information each process:: # echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger + /sys/kernel/tracing/events/syscalls/sys_enter_recvfrom/trigger - # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist + # cat /sys/kernel/tracing/events/syscalls/sys_enter_recvfrom/hist # trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active] { common_pid: smbd [ 784], size: 4 } hitcount: 1 @@ -899,9 +900,9 @@ Extended error information much smaller number, say 256:: # echo 'hist:key=child_comm:val=hitcount:size=256' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] { child_comm: dconf worker } hitcount: 1 @@ -935,9 +936,9 @@ Extended error information displays as [paused]:: # echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused] { child_comm: dconf worker } hitcount: 1 @@ -972,9 +973,9 @@ Extended error information again, and the data has changed:: # echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] { child_comm: dconf worker } hitcount: 1 @@ -1026,7 +1027,7 @@ Extended error information netif_receive_skb event:: # echo 'hist:key=stacktrace:vals=len:pause' > \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger Next, we set up an 'enable_hist' trigger on the sched_process_exec event, with an 'if filename==/usr/bin/wget' filter. The effect of @@ -1037,7 +1038,7 @@ Extended error information hash table keyed on stacktrace:: # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + /sys/kernel/tracing/events/sched/sched_process_exec/trigger The aggregation continues until the netif_receive_skb is paused again, which is what the following disable_hist event does by @@ -1045,7 +1046,7 @@ Extended error information filter 'comm==wget':: # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + /sys/kernel/tracing/events/sched/sched_process_exit/trigger Whenever a process exits and the comm field of the disable_hist trigger filter matches 'comm==wget', the netif_receive_skb hist @@ -1058,7 +1059,7 @@ Extended error information $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] { stacktrace: @@ -1142,12 +1143,12 @@ Extended error information again, we can just clear the histogram first:: # echo 'hist:key=stacktrace:vals=len:clear' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger Just to verify that it is in fact cleared, here's what we now see in the hist file:: - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] Totals: @@ -1162,21 +1163,21 @@ Extended error information sched_process_exit events as such:: # echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + /sys/kernel/tracing/events/sched/sched_process_exec/trigger # echo 'disable_event:net:netif_receive_skb if comm==wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + /sys/kernel/tracing/events/sched/sched_process_exit/trigger If you read the trigger files for the sched_process_exec and sched_process_exit triggers, you should see two triggers for each: one enabling/disabling the hist aggregation and the other enabling/disabling the logging of events:: - # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + # cat /sys/kernel/tracing/events/sched/sched_process_exec/trigger enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget - # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + # cat /sys/kernel/tracing/events/sched/sched_process_exit/trigger enable_event:net:netif_receive_skb:unlimited if comm==wget disable_hist:net:netif_receive_skb:unlimited if comm==wget @@ -1192,7 +1193,7 @@ Extended error information saw in the last run, but this time you should also see the individual events in the trace file:: - # cat /sys/kernel/debug/tracing/trace + # cat /sys/kernel/tracing/trace # tracer: nop # @@ -1226,15 +1227,15 @@ Extended error information other things:: # echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=skbaddr.hex:vals=len' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=len:vals=common_preempt_count' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger The above set of commands create four triggers differing only in their filters, along with a completely different though fairly @@ -1246,7 +1247,7 @@ Extended error information Displaying the contents of the 'hist' file for the event shows the contents of all five histograms:: - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist # event histogram # @@ -1367,15 +1368,15 @@ Extended error information field in the shared 'foo' histogram data:: # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger You can see that they're updating common histogram data by reading each event's hist files at the same time:: - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist; - cat /sys/kernel/debug/tracing/events/net/netif_rx/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist; + cat /sys/kernel/tracing/events/net/netif_rx/hist # event histogram # @@ -1488,15 +1489,15 @@ Extended error information couple of triggers named 'bar' using those fields:: # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger And displaying the output of either shows some interesting if somewhat confusing output:: - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist - # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/net/netif_rx/hist # event histogram # @@ -1786,6 +1787,8 @@ or assigned to a variable and referenced in a subsequent expression:: # echo 'hist:keys=next_pid:us_per_sec=1000000 ...' >> event/trigger # echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger +Variables can even hold stacktraces, which are useful with synthetic events. + 2.2.2 Synthetic Events ---------------------- @@ -1826,19 +1829,19 @@ variable reference to a variable on another event:: u64 lat; \ pid_t pid; \ int prio' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events Reading the tracing/synthetic_events file lists all the currently defined synthetic events, in this case the event defined above:: - # cat /sys/kernel/debug/tracing/synthetic_events + # cat /sys/kernel/tracing/synthetic_events wakeup_latency u64 lat; pid_t pid; int prio An existing synthetic event definition can be removed by prepending the command that defined it with a '!':: # echo '!wakeup_latency u64 lat pid_t pid int prio' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events At this point, there isn't yet an actual 'wakeup_latency' event instantiated in the event subsystem - for this to happen, a 'hist @@ -1850,20 +1853,20 @@ done, the 'wakeup_latency' synthetic event instance is created. The new event is created under the tracing/events/synthetic/ directory and looks and behaves just like any other event:: - # ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency + # ls /sys/kernel/tracing/events/synthetic/wakeup_latency enable filter format hist id trigger A histogram can now be defined for the new synthetic event:: # echo 'hist:keys=pid,prio,lat.log2:sort=lat' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger + /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger The above shows the latency "lat" in a power of 2 grouping. Like any other event, once a histogram is enabled for the event, the -output can be displayed by reading the event's 'hist' file. +output can be displayed by reading the event's 'hist' file:: - # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist + # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist # event histogram # @@ -1908,10 +1911,10 @@ output can be displayed by reading the event's 'hist' file. The latency values can also be grouped linearly by a given size with -the ".buckets" modifier and specify a size (in this case groups of 10). +the ".buckets" modifier and specify a size (in this case groups of 10):: # echo 'hist:keys=pid,prio,lat.buckets=10:sort=lat' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger + /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger # event histogram # @@ -1940,6 +1943,157 @@ the ".buckets" modifier and specify a size (in this case groups of 10). Entries: 16 Dropped: 0 +To save stacktraces, create a synthetic event with a field of type "unsigned long[]" +or even just "long[]". For example, to see how long a task is blocked in an +uninterruptible state:: + + # cd /sys/kernel/tracing + # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events + # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger + # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger + # echo 1 > events/synthetic/block_lat/enable + # cat trace + + # tracer: nop + # + # entries-in-buffer/entries-written: 2/2 #P:8 + # + # _-----=> irqs-off/BH-disabled + # / _----=> need-resched + # | / _---=> hardirq/softirq + # || / _--=> preempt-depth + # ||| / _-=> migrate-disable + # |||| / delay + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION + # | | | ||||| | | + <idle>-0 [005] d..4. 521.164922: block_lat: pid=0 delta=8322 stack=STACK: + => __schedule+0x448/0x7b0 + => schedule+0x5a/0xb0 + => io_schedule+0x42/0x70 + => bit_wait_io+0xd/0x60 + => __wait_on_bit+0x4b/0x140 + => out_of_line_wait_on_bit+0x91/0xb0 + => jbd2_journal_commit_transaction+0x1679/0x1a70 + => kjournald2+0xa9/0x280 + => kthread+0xe9/0x110 + => ret_from_fork+0x2c/0x50 + + <...>-2 [004] d..4. 525.184257: block_lat: pid=2 delta=76 stack=STACK: + => __schedule+0x448/0x7b0 + => schedule+0x5a/0xb0 + => schedule_timeout+0x11a/0x150 + => wait_for_completion_killable+0x144/0x1f0 + => __kthread_create_on_node+0xe7/0x1e0 + => kthread_create_on_node+0x51/0x70 + => create_worker+0xcc/0x1a0 + => worker_thread+0x2ad/0x380 + => kthread+0xe9/0x110 + => ret_from_fork+0x2c/0x50 + +A synthetic event that has a stacktrace field may use it as a key in +histogram:: + + # echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger + # cat events/synthetic/block_lat/hist + + # event histogram + # + # trigger info: hist:keys=delta.buckets=100,stack.stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active] + # + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + io_schedule+0x46/0x80 + bit_wait_io+0x11/0x80 + __wait_on_bit+0x4e/0x120 + out_of_line_wait_on_bit+0x8d/0xb0 + __wait_on_buffer+0x33/0x40 + jbd2_journal_commit_transaction+0x155a/0x19b0 + kjournald2+0xab/0x270 + kthread+0xfa/0x130 + ret_from_fork+0x29/0x50 + } hitcount: 1 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + io_schedule+0x46/0x80 + rq_qos_wait+0xd0/0x170 + wbt_wait+0x9e/0xf0 + __rq_qos_throttle+0x25/0x40 + blk_mq_submit_bio+0x2c3/0x5b0 + __submit_bio+0xff/0x190 + submit_bio_noacct_nocheck+0x25b/0x2b0 + submit_bio_noacct+0x20b/0x600 + submit_bio+0x28/0x90 + ext4_bio_write_page+0x1e0/0x8c0 + mpage_submit_page+0x60/0x80 + mpage_process_page_bufs+0x16c/0x180 + mpage_prepare_extent_to_map+0x23f/0x530 + } hitcount: 1 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_hrtimeout_range_clock+0x97/0x110 + schedule_hrtimeout_range+0x13/0x20 + usleep_range_state+0x65/0x90 + __intel_wait_for_register+0x1c1/0x230 [i915] + intel_psr_wait_for_idle_locked+0x171/0x2a0 [i915] + intel_pipe_update_start+0x169/0x360 [i915] + intel_update_crtc+0x112/0x490 [i915] + skl_commit_modeset_enables+0x199/0x600 [i915] + intel_atomic_commit_tail+0x7c4/0x1080 [i915] + intel_atomic_commit_work+0x12/0x20 [i915] + process_one_work+0x21c/0x3f0 + worker_thread+0x50/0x3e0 + kthread+0xfa/0x130 + } hitcount: 3 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_timeout+0x11e/0x160 + __wait_for_common+0x8f/0x190 + wait_for_completion+0x24/0x30 + __flush_work.isra.0+0x1cc/0x360 + flush_work+0xe/0x20 + drm_mode_rmfb+0x18b/0x1d0 [drm] + drm_mode_rmfb_ioctl+0x10/0x20 [drm] + drm_ioctl_kernel+0xb8/0x150 [drm] + drm_ioctl+0x243/0x560 [drm] + __x64_sys_ioctl+0x92/0xd0 + do_syscall_64+0x59/0x90 + entry_SYSCALL_64_after_hwframe+0x72/0xdc + } hitcount: 1 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_timeout+0x87/0x160 + __wait_for_common+0x8f/0x190 + wait_for_completion_timeout+0x1d/0x30 + drm_atomic_helper_wait_for_flip_done+0x57/0x90 [drm_kms_helper] + intel_atomic_commit_tail+0x8ce/0x1080 [i915] + intel_atomic_commit_work+0x12/0x20 [i915] + process_one_work+0x21c/0x3f0 + worker_thread+0x50/0x3e0 + kthread+0xfa/0x130 + ret_from_fork+0x29/0x50 + } hitcount: 1 + { delta: ~ 100-199, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_hrtimeout_range_clock+0x97/0x110 + schedule_hrtimeout_range+0x13/0x20 + usleep_range_state+0x65/0x90 + pci_set_low_power_state+0x17f/0x1f0 + pci_set_power_state+0x49/0x250 + pci_finish_runtime_suspend+0x4a/0x90 + pci_pm_runtime_suspend+0xcb/0x1b0 + __rpm_callback+0x48/0x120 + rpm_callback+0x67/0x70 + rpm_suspend+0x167/0x780 + rpm_idle+0x25a/0x380 + pm_runtime_work+0x93/0xc0 + process_one_work+0x21c/0x3f0 + } hitcount: 1 + + Totals: + Hits: 10 + Entries: 7 + Dropped: 0 + 2.2.3 Hist trigger 'handlers' and 'actions' ------------------------------------------- @@ -2039,9 +2193,9 @@ The following commonly-used handler.action pairs are available: event:: # echo 'wakeup_new_test pid_t pid' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events - # cat /sys/kernel/debug/tracing/synthetic_events + # cat /sys/kernel/tracing/synthetic_events wakeup_new_test pid_t pid The following hist trigger both defines the missing testpid @@ -2052,26 +2206,26 @@ The following commonly-used handler.action pairs are available: # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ wakeup_new_test($testpid) if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger - Or, equivalently, using the 'trace' keyword syntax: + Or, equivalently, using the 'trace' keyword syntax:: - # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ - trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ + trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \ + /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger Creating and displaying a histogram based on those events is now just a matter of using the fields and new synthetic event in the tracing/events/synthetic directory, as usual:: # echo 'hist:keys=pid:sort=pid' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger + /sys/kernel/tracing/events/synthetic/wakeup_new_test/trigger Running 'cyclictest' should cause wakeup_new events to generate wakeup_new_test synthetic events which should result in histogram output in the wakeup_new_test event's hist file:: - # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist + # cat /sys/kernel/tracing/events/synthetic/wakeup_new_test/hist A more typical usage would be to use two events to calculate a latency. The following example uses a set of hist triggers to @@ -2080,14 +2234,14 @@ The following commonly-used handler.action pairs are available: First, we define a 'wakeup_latency' synthetic event:: # echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events Next, we specify that whenever we see a sched_waking event for a cyclictest thread, save the timestamp in a 'ts0' variable:: # echo 'hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs \ if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + /sys/kernel/tracing/events/sched/sched_waking/trigger Then, when the corresponding thread is actually scheduled onto the CPU by a sched_switch event (saved_pid matches next_pid), calculate @@ -2097,19 +2251,19 @@ The following commonly-used handler.action pairs are available: # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\ onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\ $saved_pid,next_prio) if next_comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger We also need to create a histogram on the wakeup_latency synthetic event in order to aggregate the generated synthetic event data:: # echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger + /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger Finally, once we've run cyclictest to actually generate some events, we can see the output by looking at the wakeup_latency synthetic event's hist file:: - # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist + # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist - onmax(var).save(field,.. .) @@ -2135,19 +2289,19 @@ The following commonly-used handler.action pairs are available: # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:\ wakeup_lat=common_timestamp.usecs-$ts0:\ onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \ if next_comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger When the histogram is displayed, the max value and the saved values corresponding to the max are displayed following the rest of the fields:: - # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist + # cat /sys/kernel/tracing/events/sched/sched_switch/hist { next_pid: 2255 } hitcount: 239 common_timestamp-ts0: 0 max: 27 @@ -2191,48 +2345,48 @@ The following commonly-used handler.action pairs are available: resulting latency, stored in wakeup_lat, exceeds the current maximum latency, a snapshot is taken. As part of the setup, all the scheduler events are also enabled, which are the events that - will show up in the snapshot when it is taken at some point: + will show up in the snapshot when it is taken at some point:: - # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + # echo 1 > /sys/kernel/tracing/events/sched/enable - # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ - if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ + if comm=="cyclictest"' >> \ + /sys/kernel/tracing/events/sched/sched_waking/trigger - # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ - onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \ - prev_comm):onmax($wakeup_lat).snapshot() \ - if next_comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ + onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \ + prev_comm):onmax($wakeup_lat).snapshot() \ + if next_comm=="cyclictest"' >> \ + /sys/kernel/tracing/events/sched/sched_switch/trigger When the histogram is displayed, for each bucket the max value and the saved values corresponding to the max are displayed following the rest of the fields. If a snapshot was taken, there is also a message indicating that, - along with the value and event that triggered the global maximum: + along with the value and event that triggered the global maximum:: - # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist - { next_pid: 2101 } hitcount: 200 - max: 52 next_prio: 120 next_comm: cyclictest \ - prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 + # cat /sys/kernel/tracing/events/sched/sched_switch/hist + { next_pid: 2101 } hitcount: 200 + max: 52 next_prio: 120 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 - { next_pid: 2103 } hitcount: 1326 - max: 572 next_prio: 19 next_comm: cyclictest \ - prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 + { next_pid: 2103 } hitcount: 1326 + max: 572 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 - { next_pid: 2102 } hitcount: 1982 \ - max: 74 next_prio: 19 next_comm: cyclictest \ - prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 + { next_pid: 2102 } hitcount: 1982 \ + max: 74 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 - Snapshot taken (see tracing/snapshot). Details: - triggering value { onmax($wakeup_lat) }: 572 \ - triggered by event with key: { next_pid: 2103 } + Snapshot taken (see tracing/snapshot). Details: + triggering value { onmax($wakeup_lat) }: 572 \ + triggered by event with key: { next_pid: 2103 } - Totals: - Hits: 3508 - Entries: 3 - Dropped: 0 + Totals: + Hits: 3508 + Entries: 3 + Dropped: 0 In the above case, the event that triggered the global maximum has the key with next_pid == 2103. If you look at the bucket that has @@ -2247,7 +2401,7 @@ The following commonly-used handler.action pairs are available: sched_switch events, which should match the time displayed in the global maximum):: - # cat /sys/kernel/debug/tracing/snapshot + # cat /sys/kernel/tracing/snapshot <...>-2103 [005] d..3 309.873125: sched_switch: prev_comm=cyclictest prev_pid=2103 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120 <idle>-0 [005] d.h3 309.873611: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005 @@ -2310,15 +2464,15 @@ The following commonly-used handler.action pairs are available: $cwnd variable. If the value has changed, a snapshot is taken. As part of the setup, all the scheduler and tcp events are also enabled, which are the events that will show up in the snapshot - when it is taken at some point: + when it is taken at some point:: - # echo 1 > /sys/kernel/debug/tracing/events/sched/enable - # echo 1 > /sys/kernel/debug/tracing/events/tcp/enable + # echo 1 > /sys/kernel/tracing/events/sched/enable + # echo 1 > /sys/kernel/tracing/events/tcp/enable - # echo 'hist:keys=dport:cwnd=snd_cwnd: \ - onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \ - onchange($cwnd).snapshot()' >> \ - /sys/kernel/debug/tracing/events/tcp/tcp_probe/trigger + # echo 'hist:keys=dport:cwnd=snd_cwnd: \ + onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \ + onchange($cwnd).snapshot()' >> \ + /sys/kernel/tracing/events/tcp/tcp_probe/trigger When the histogram is displayed, for each bucket the tracked value and the saved values corresponding to that value are displayed @@ -2327,7 +2481,7 @@ The following commonly-used handler.action pairs are available: If a snapshot was taken, there is also a message indicating that, along with the value and event that triggered the snapshot:: - # cat /sys/kernel/debug/tracing/events/tcp/tcp_probe/hist + # cat /sys/kernel/tracing/events/tcp/tcp_probe/hist { dport: 1521 } hitcount: 8 changed: 10 snd_wnd: 35456 srtt: 154262 rcv_wnd: 42112 @@ -2341,10 +2495,10 @@ The following commonly-used handler.action pairs are available: { dport: 443 } hitcount: 211 changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800 - Snapshot taken (see tracing/snapshot). Details:: + Snapshot taken (see tracing/snapshot). Details: - triggering value { onchange($cwnd) }: 10 - triggered by event with key: { dport: 80 } + triggering value { onchange($cwnd) }: 10 + triggered by event with key: { dport: 80 } Totals: Hits: 414 @@ -2361,7 +2515,7 @@ The following commonly-used handler.action pairs are available: And finally, looking at the snapshot data should show at or near the end the event that triggered the snapshot:: - # cat /sys/kernel/debug/tracing/snapshot + # cat /sys/kernel/tracing/snapshot gnome-shell-1261 [006] dN.3 49.823113: sched_stat_runtime: comm=gnome-shell pid=1261 runtime=49347 [ns] vruntime=1835730389 [ns] kworker/u16:4-773 [003] d..3 49.823114: sched_switch: prev_comm=kworker/u16:4 prev_pid=773 prev_prio=120 prev_state=R+ ==> next_comm=kworker/3:2 next_pid=135 next_prio=120 diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst index 08a2a6a3782f..651f9ab53f3e 100644 --- a/Documentation/trace/kprobetrace.rst +++ b/Documentation/trace/kprobetrace.rst @@ -6,21 +6,21 @@ Kprobe-based Event Tracing Overview -------- -These events are similar to tracepoint based events. Instead of Tracepoint, +These events are similar to tracepoint-based events. Instead of tracepoints, this is based on kprobes (kprobe and kretprobe). So it can probe wherever kprobes can probe (this means, all functions except those with __kprobes/nokprobe_inline annotation and those marked NOKPROBE_SYMBOL). -Unlike the Tracepoint based event, this can be added and removed +Unlike the tracepoint-based event, this can be added and removed dynamically, on the fly. To enable this feature, build your kernel with CONFIG_KPROBE_EVENTS=y. -Similar to the events tracer, this doesn't need to be activated via +Similar to the event tracer, this doesn't need to be activated via current_tracer. Instead of that, add probe points via -/sys/kernel/debug/tracing/kprobe_events, and enable it via -/sys/kernel/debug/tracing/events/kprobes/<EVENT>/enable. +/sys/kernel/tracing/kprobe_events, and enable it via +/sys/kernel/tracing/events/kprobes/<EVENT>/enable. -You can also use /sys/kernel/debug/tracing/dynamic_events instead of +You can also use /sys/kernel/tracing/dynamic_events instead of kprobe_events. That interface will provide unified access to other dynamic events too. @@ -58,7 +58,7 @@ Synopsis of kprobe_events NAME=FETCHARG : Set NAME as the argument name of FETCHARG. FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types - (x8/x16/x32/x64), "string", "ustring", "symbol", "symstr" + (x8/x16/x32/x64), "char", "string", "ustring", "symbol", "symstr" and bitfield are supported. (\*1) only for the probe on function entry (offs == 0). @@ -68,22 +68,27 @@ Synopsis of kprobe_events Types ----- -Several types are supported for fetch-args. Kprobe tracer will access memory +Several types are supported for fetchargs. Kprobe tracer will access memory by given type. Prefix 's' and 'u' means those types are signed and unsigned respectively. 'x' prefix implies it is unsigned. Traced arguments are shown in decimal ('s' and 'u') or hexadecimal ('x'). Without type casting, 'x32' or 'x64' is used depends on the architecture (e.g. x86-32 uses x32, and x86-64 uses x64). + These value types can be an array. To record array data, you can add '[N]' (where N is a fixed number, less than 64) to the base type. -E.g. 'x16[4]' means an array of x16 (2bytes hex) with 4 elements. +E.g. 'x16[4]' means an array of x16 (2-byte hex) with 4 elements. Note that the array can be applied to memory type fetchargs, you can not apply it to registers/stack-entries etc. (for example, '$stack1:x8[8]' is wrong, but '+8($stack):x8[8]' is OK.) + +Char type can be used to show the character value of traced arguments. + String type is a special type, which fetches a "null-terminated" string from kernel space. This means it will fail and store NULL if the string container has been paged out. "ustring" type is an alternative of string for user-space. -See :ref:`user_mem_access` for more info.. +See :ref:`user_mem_access` for more info. + The string array type is a bit different from other types. For other base types, <base-type>[1] is equal to <base-type> (e.g. +0(%di):x32[1] is same as +0(%di):x32.) But string[1] is not equal to string. The string type itself @@ -120,8 +125,8 @@ space. 'ustring' is a shortcut way of performing the same task. That is, Note that kprobe-event provides the user-memory access syntax but it doesn't use it transparently. This means if you use normal dereference or string type -for user memory, it might fail, and may always fail on some archs. The user -has to carefully check if the target data is in kernel or user space. +for user memory, it might fail, and may always fail on some architectures. The +user has to carefully check if the target data is in kernel or user space. Per-Probe Event Filtering ------------------------- @@ -150,7 +155,7 @@ trigger: Event Profiling --------------- You can check the total number of probe hits and probe miss-hits via -/sys/kernel/debug/tracing/kprobe_profile. +/sys/kernel/tracing/kprobe_profile. The first column is event name, the second is the number of probe hits, the third is the number of probe miss-hits. @@ -160,11 +165,11 @@ You can add and enable new kprobe events when booting up the kernel by "kprobe_event=" parameter. The parameter accepts a semicolon-delimited kprobe events, which format is similar to the kprobe_events. The difference is that the probe definition parameters are comma-delimited -instead of space. For example, adding myprobe event on do_sys_open like below +instead of space. For example, adding myprobe event on do_sys_open like below:: p:myprobe do_sys_open dfd=%ax filename=%dx flags=%cx mode=+4($stack) -should be below for kernel boot parameter (just replace spaces with comma) +should be below for kernel boot parameter (just replace spaces with comma):: p:myprobe,do_sys_open,dfd=%ax,filename=%dx,flags=%cx,mode=+4($stack) @@ -174,7 +179,7 @@ Usage examples To add a probe as a new event, write a new definition to kprobe_events as below:: - echo 'p:myprobe do_sys_open dfd=%ax filename=%dx flags=%cx mode=+4($stack)' > /sys/kernel/debug/tracing/kprobe_events + echo 'p:myprobe do_sys_open dfd=%ax filename=%dx flags=%cx mode=+4($stack)' > /sys/kernel/tracing/kprobe_events This sets a kprobe on the top of do_sys_open() function with recording 1st to 4th arguments as "myprobe" event. Note, which register/stack entry is @@ -184,15 +189,15 @@ under tools/perf/). As this example shows, users can choose more familiar names for each arguments. :: - echo 'r:myretprobe do_sys_open $retval' >> /sys/kernel/debug/tracing/kprobe_events + echo 'r:myretprobe do_sys_open $retval' >> /sys/kernel/tracing/kprobe_events This sets a kretprobe on the return point of do_sys_open() function with recording return value as "myretprobe" event. You can see the format of these events via -/sys/kernel/debug/tracing/events/kprobes/<EVENT>/format. +/sys/kernel/tracing/events/kprobes/<EVENT>/format. :: - cat /sys/kernel/debug/tracing/events/kprobes/myprobe/format + cat /sys/kernel/tracing/events/kprobes/myprobe/format name: myprobe ID: 780 format: @@ -215,7 +220,7 @@ You can see the format of these events via You can see that the event has 4 arguments as in the expressions you specified. :: - echo > /sys/kernel/debug/tracing/kprobe_events + echo > /sys/kernel/tracing/kprobe_events This clears all probe points. @@ -230,8 +235,8 @@ Right after definition, each event is disabled by default. For tracing these events, you need to enable it. :: - echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable - echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable + echo 1 > /sys/kernel/tracing/events/kprobes/myprobe/enable + echo 1 > /sys/kernel/tracing/events/kprobes/myretprobe/enable Use the following command to start tracing in an interval. :: @@ -240,10 +245,10 @@ Use the following command to start tracing in an interval. Open something... # echo 0 > tracing_on -And you can see the traced information via /sys/kernel/debug/tracing/trace. +And you can see the traced information via /sys/kernel/tracing/trace. :: - cat /sys/kernel/debug/tracing/trace + cat /sys/kernel/tracing/trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION diff --git a/Documentation/trace/mmiotrace.rst b/Documentation/trace/mmiotrace.rst index fed13eaead89..95b750722a13 100644 --- a/Documentation/trace/mmiotrace.rst +++ b/Documentation/trace/mmiotrace.rst @@ -36,11 +36,11 @@ Usage Quick Reference :: $ mount -t debugfs debugfs /sys/kernel/debug - $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer - $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt & + $ echo mmiotrace > /sys/kernel/tracing/current_tracer + $ cat /sys/kernel/tracing/trace_pipe > mydump.txt & Start X or whatever. - $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker - $ echo nop > /sys/kernel/debug/tracing/current_tracer + $ echo "X is up" > /sys/kernel/tracing/trace_marker + $ echo nop > /sys/kernel/tracing/current_tracer Check for lost events. @@ -56,11 +56,11 @@ Check that the driver you are about to trace is not loaded. Activate mmiotrace (requires root privileges):: - $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer + $ echo mmiotrace > /sys/kernel/tracing/current_tracer Start storing the trace:: - $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt & + $ cat /sys/kernel/tracing/trace_pipe > mydump.txt & The 'cat' process should stay running (sleeping) in the background. @@ -68,14 +68,14 @@ Load the driver you want to trace and use it. Mmiotrace will only catch MMIO accesses to areas that are ioremapped while mmiotrace is active. During tracing you can place comments (markers) into the trace by -$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker +$ echo "X is up" > /sys/kernel/tracing/trace_marker This makes it easier to see which part of the (huge) trace corresponds to which action. It is recommended to place descriptive markers about what you do. Shut down mmiotrace (requires root privileges):: - $ echo nop > /sys/kernel/debug/tracing/current_tracer + $ echo nop > /sys/kernel/tracing/current_tracer The 'cat' process exits. If it does not, kill it by issuing 'fg' command and pressing ctrl+c. @@ -93,12 +93,12 @@ events were lost, the trace is incomplete. You should enlarge the buffers and try again. Buffers are enlarged by first seeing how large the current buffers are:: - $ cat /sys/kernel/debug/tracing/buffer_size_kb + $ cat /sys/kernel/tracing/buffer_size_kb gives you a number. Approximately double this number and write it back, for instance:: - $ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb + $ echo 128000 > /sys/kernel/tracing/buffer_size_kb Then start again from the top. diff --git a/Documentation/trace/postprocess/trace-pagealloc-postprocess.pl b/Documentation/trace/postprocess/trace-pagealloc-postprocess.pl index b9b7d80c2f9d..d16494c5e200 100644 --- a/Documentation/trace/postprocess/trace-pagealloc-postprocess.pl +++ b/Documentation/trace/postprocess/trace-pagealloc-postprocess.pl @@ -4,7 +4,7 @@ # to extract some high-level information on what is going on. The accuracy of the parser # may vary considerably # -# Example usage: trace-pagealloc-postprocess.pl < /sys/kernel/debug/tracing/trace_pipe +# Example usage: trace-pagealloc-postprocess.pl < /sys/kernel/tracing/trace_pipe # other options # --prepend-parent Report on the parent proc and PID # --read-procstat If the trace lacks process info, get it from /proc @@ -94,7 +94,7 @@ sub generate_traceevent_regex { my $regex; # Read the event format or use the default - if (!open (FORMAT, "/sys/kernel/debug/tracing/events/$event/format")) { + if (!open (FORMAT, "/sys/kernel/tracing/events/$event/format")) { $regex = $default; } else { my $line; diff --git a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl index 2f4e39875fb3..e24c009789a0 100644 --- a/Documentation/trace/postprocess/trace-vmscan-postprocess.pl +++ b/Documentation/trace/postprocess/trace-vmscan-postprocess.pl @@ -3,7 +3,7 @@ # page reclaim. It makes an attempt to extract some high-level information on # what is going on. The accuracy of the parser may vary # -# Example usage: trace-vmscan-postprocess.pl < /sys/kernel/debug/tracing/trace_pipe +# Example usage: trace-vmscan-postprocess.pl < /sys/kernel/tracing/trace_pipe # other options # --read-procstat If the trace lacks process info, get it from /proc # --ignore-pid Aggregate processes of the same name together @@ -140,7 +140,7 @@ sub generate_traceevent_regex { my $regex; # Read the event format or use the default - if (!open (FORMAT, "/sys/kernel/debug/tracing/events/$event/format")) { + if (!open (FORMAT, "/sys/kernel/tracing/events/$event/format")) { print("WARNING: Event $event format string not found\n"); return $default; } else { diff --git a/Documentation/trace/tracepoint-analysis.rst b/Documentation/trace/tracepoint-analysis.rst index 716326b9f152..be01bf7b47e5 100644 --- a/Documentation/trace/tracepoint-analysis.rst +++ b/Documentation/trace/tracepoint-analysis.rst @@ -26,10 +26,10 @@ assumed that the PCL tool tools/perf has been installed and is in your path. 2.1 Standard Utilities ---------------------- -All possible events are visible from /sys/kernel/debug/tracing/events. Simply +All possible events are visible from /sys/kernel/tracing/events. Simply calling:: - $ find /sys/kernel/debug/tracing/events -type d + $ find /sys/kernel/tracing/events -type d will give a fair indication of the number of events available. @@ -59,7 +59,7 @@ See Documentation/trace/events.rst for a proper description on how events can be enabled system-wide. A short example of enabling all events related to page allocation would look something like:: - $ for i in `find /sys/kernel/debug/tracing/events -name "enable" | grep mm_`; do echo 1 > $i; done + $ for i in `find /sys/kernel/tracing/events -name "enable" | grep mm_`; do echo 1 > $i; done 3.2 System-Wide Event Enabling with SystemTap --------------------------------------------- @@ -189,7 +189,7 @@ time on a system-wide basis using -a and sleep. ============================================ When events are enabled the events that are triggering can be read from -/sys/kernel/debug/tracing/trace_pipe in human-readable format although binary +/sys/kernel/tracing/trace_pipe in human-readable format although binary options exist as well. By post-processing the output, further information can be gathered on-line as appropriate. Examples of post-processing might include diff --git a/Documentation/trace/uprobetracer.rst b/Documentation/trace/uprobetracer.rst index 3a1797d707f4..122d15572fd5 100644 --- a/Documentation/trace/uprobetracer.rst +++ b/Documentation/trace/uprobetracer.rst @@ -12,13 +12,13 @@ To enable this feature, build your kernel with CONFIG_UPROBE_EVENTS=y. Similar to the kprobe-event tracer, this doesn't need to be activated via current_tracer. Instead of that, add probe points via -/sys/kernel/debug/tracing/uprobe_events, and enable it via -/sys/kernel/debug/tracing/events/uprobes/<EVENT>/enable. +/sys/kernel/tracing/uprobe_events, and enable it via +/sys/kernel/tracing/events/uprobes/<EVENT>/enable. However unlike kprobe-event tracer, the uprobe event interface expects the user to calculate the offset of the probepoint in the object. -You can also use /sys/kernel/debug/tracing/dynamic_events instead of +You can also use /sys/kernel/tracing/dynamic_events instead of uprobe_events. That interface will provide unified access to other dynamic events too. @@ -79,7 +79,7 @@ For $comm, the default type is "string"; any other type is invalid. Event Profiling --------------- You can check the total number of probe hits per event via -/sys/kernel/debug/tracing/uprobe_profile. The first column is the filename, +/sys/kernel/tracing/uprobe_profile. The first column is the filename, the second is the event name, the third is the number of probe hits. Usage examples @@ -87,28 +87,28 @@ Usage examples * Add a probe as a new uprobe event, write a new definition to uprobe_events as below (sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash):: - echo 'p /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events + echo 'p /bin/bash:0x4245c0' > /sys/kernel/tracing/uprobe_events * Add a probe as a new uretprobe event:: - echo 'r /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events + echo 'r /bin/bash:0x4245c0' > /sys/kernel/tracing/uprobe_events * Unset registered event:: - echo '-:p_bash_0x4245c0' >> /sys/kernel/debug/tracing/uprobe_events + echo '-:p_bash_0x4245c0' >> /sys/kernel/tracing/uprobe_events * Print out the events that are registered:: - cat /sys/kernel/debug/tracing/uprobe_events + cat /sys/kernel/tracing/uprobe_events * Clear all events:: - echo > /sys/kernel/debug/tracing/uprobe_events + echo > /sys/kernel/tracing/uprobe_events Following example shows how to dump the instruction pointer and %ax register at the probed text address. Probe zfree function in /bin/zsh:: - # cd /sys/kernel/debug/tracing/ + # cd /sys/kernel/tracing/ # cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp 00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh # objdump -T /bin/zsh | grep -w zfree @@ -168,7 +168,7 @@ Also, you can disable the event by:: # echo 0 > events/uprobes/enable -And you can see the traced information via /sys/kernel/debug/tracing/trace. +And you can see the traced information via /sys/kernel/tracing/trace. :: # cat trace diff --git a/Documentation/trace/user_events.rst b/Documentation/trace/user_events.rst index 9f181f342a70..422802ef4025 100644 --- a/Documentation/trace/user_events.rst +++ b/Documentation/trace/user_events.rst @@ -11,10 +11,10 @@ that can be viewed via existing tools, such as ftrace and perf. To enable this feature, build your kernel with CONFIG_USER_EVENTS=y. Programs can view status of the events via -/sys/kernel/debug/tracing/user_events_status and can both register and write -data out via /sys/kernel/debug/tracing/user_events_data. +/sys/kernel/tracing/user_events_status and can both register and write +data out via /sys/kernel/tracing/user_events_data. -Programs can also use /sys/kernel/debug/tracing/dynamic_events to register and +Programs can also use /sys/kernel/tracing/dynamic_events to register and delete user based events via the u: prefix. The format of the command to dynamic_events is the same as the ioctl with the u: prefix applied. @@ -22,9 +22,9 @@ Typically programs will register a set of events that they wish to expose to tools that can read trace_events (such as ftrace and perf). The registration process gives back two ints to the program for each event. The first int is the status bit. This describes which bit in little-endian format in the -/sys/kernel/debug/tracing/user_events_status file represents this event. The +/sys/kernel/tracing/user_events_status file represents this event. The second int is the write index which describes the data when a write() or -writev() is called on the /sys/kernel/debug/tracing/user_events_data file. +writev() is called on the /sys/kernel/tracing/user_events_data file. The structures referenced in this document are contained within the /include/uapi/linux/user_events.h file in the source tree. @@ -35,7 +35,7 @@ filesystem and may be mounted at different paths than above.* Registering ----------- Registering within a user process is done via ioctl() out to the -/sys/kernel/debug/tracing/user_events_data file. The command to issue is +/sys/kernel/tracing/user_events_data file. The command to issue is DIAG_IOCSREG. This command takes a packed struct user_reg as an argument:: @@ -54,7 +54,7 @@ and the write index. User based events show up under tracefs like any other event under the subsystem named "user_events". This means tools that wish to attach to the -events need to use /sys/kernel/debug/tracing/events/user_events/[name]/enable +events need to use /sys/kernel/tracing/events/user_events/[name]/enable or perf record -e user_events:[name] when attaching/recording. **NOTE:** *The write_index returned is only valid for the FD that was used* @@ -96,7 +96,7 @@ Would be represented by the following field:: Deleting ----------- Deleting an event from within a user process is done via ioctl() out to the -/sys/kernel/debug/tracing/user_events_data file. The command to issue is +/sys/kernel/tracing/user_events_data file. The command to issue is DIAG_IOCSDEL. This command only requires a single string specifying the event to delete by @@ -110,7 +110,7 @@ When tools attach/record user based events the status of the event is updated in realtime. This allows user programs to only incur the cost of the write() or writev() calls when something is actively attached to the event. -User programs call mmap() on /sys/kernel/debug/tracing/user_events_status to +User programs call mmap() on /sys/kernel/tracing/user_events_status to check the status for each event that is registered. The bit to check in the file is given back after the register ioctl() via user_reg.status_bit. The bit is always in little-endian format. Programs can check if the bit is set either |