About Kernel Documentation Linux Kernel Contact Linux Resources Linux Blog

Documentation / trace / events.txt




Custom Search

Based on kernel version 3.16. Page generated on 2014-08-06 21:41 EST.

1				     Event Tracing
2	
3			Documentation written by Theodore Ts'o
4			Updated by Li Zefan and Tom Zanussi
5	
6	1. Introduction
7	===============
8	
9	Tracepoints (see Documentation/trace/tracepoints.txt) can be used
10	without creating custom kernel modules to register probe functions
11	using the event tracing infrastructure.
12	
13	Not all tracepoints can be traced using the event tracing system;
14	the kernel developer must provide code snippets which define how the
15	tracing information is saved into the tracing buffer, and how the
16	tracing information should be printed.
17	
18	2. Using Event Tracing
19	======================
20	
21	2.1 Via the 'set_event' interface
22	---------------------------------
23	
24	The events which are available for tracing can be found in the file
25	/sys/kernel/debug/tracing/available_events.
26	
27	To enable a particular event, such as 'sched_wakeup', simply echo it
28	to /sys/kernel/debug/tracing/set_event. For example:
29	
30		# echo sched_wakeup >> /sys/kernel/debug/tracing/set_event
31	
32	[ Note: '>>' is necessary, otherwise it will firstly disable
33	  all the events. ]
34	
35	To disable an event, echo the event name to the set_event file prefixed
36	with an exclamation point:
37	
38		# echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event
39	
40	To disable all events, echo an empty line to the set_event file:
41	
42		# echo > /sys/kernel/debug/tracing/set_event
43	
44	To enable all events, echo '*:*' or '*:' to the set_event file:
45	
46		# echo *:* > /sys/kernel/debug/tracing/set_event
47	
48	The events are organized into subsystems, such as ext4, irq, sched,
49	etc., and a full event name looks like this: <subsystem>:<event>.  The
50	subsystem name is optional, but it is displayed in the available_events
51	file.  All of the events in a subsystem can be specified via the syntax
52	"<subsystem>:*"; for example, to enable all irq events, you can use the
53	command:
54	
55		# echo 'irq:*' > /sys/kernel/debug/tracing/set_event
56	
57	2.2 Via the 'enable' toggle
58	---------------------------
59	
60	The events available are also listed in /sys/kernel/debug/tracing/events/ hierarchy
61	of directories.
62	
63	To enable event 'sched_wakeup':
64	
65		# echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
66	
67	To disable it:
68	
69		# echo 0 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
70	
71	To enable all events in sched subsystem:
72	
73		# echo 1 > /sys/kernel/debug/tracing/events/sched/enable
74	
75	To enable all events:
76	
77		# echo 1 > /sys/kernel/debug/tracing/events/enable
78	
79	When reading one of these enable files, there are four results:
80	
81	 0 - all events this file affects are disabled
82	 1 - all events this file affects are enabled
83	 X - there is a mixture of events enabled and disabled
84	 ? - this file does not affect any event
85	
86	2.3 Boot option
87	---------------
88	
89	In order to facilitate early boot debugging, use boot option:
90	
91		trace_event=[event-list]
92	
93	event-list is a comma separated list of events. See section 2.1 for event
94	format.
95	
96	3. Defining an event-enabled tracepoint
97	=======================================
98	
99	See The example provided in samples/trace_events
100	
101	4. Event formats
102	================
103	
104	Each trace event has a 'format' file associated with it that contains
105	a description of each field in a logged event.  This information can
106	be used to parse the binary trace stream, and is also the place to
107	find the field names that can be used in event filters (see section 5).
108	
109	It also displays the format string that will be used to print the
110	event in text mode, along with the event name and ID used for
111	profiling.
112	
113	Every event has a set of 'common' fields associated with it; these are
114	the fields prefixed with 'common_'.  The other fields vary between
115	events and correspond to the fields defined in the TRACE_EVENT
116	definition for that event.
117	
118	Each field in the format has the form:
119	
120	     field:field-type field-name; offset:N; size:N;
121	
122	where offset is the offset of the field in the trace record and size
123	is the size of the data item, in bytes.
124	
125	For example, here's the information displayed for the 'sched_wakeup'
126	event:
127	
128	# cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/format
129	
130	name: sched_wakeup
131	ID: 60
132	format:
133		field:unsigned short common_type;	offset:0;	size:2;
134		field:unsigned char common_flags;	offset:2;	size:1;
135		field:unsigned char common_preempt_count;	offset:3;	size:1;
136		field:int common_pid;	offset:4;	size:4;
137		field:int common_tgid;	offset:8;	size:4;
138	
139		field:char comm[TASK_COMM_LEN];	offset:12;	size:16;
140		field:pid_t pid;	offset:28;	size:4;
141		field:int prio;	offset:32;	size:4;
142		field:int success;	offset:36;	size:4;
143		field:int cpu;	offset:40;	size:4;
144	
145	print fmt: "task %s:%d [%d] success=%d [%03d]", REC->comm, REC->pid,
146		   REC->prio, REC->success, REC->cpu
147	
148	This event contains 10 fields, the first 5 common and the remaining 5
149	event-specific.  All the fields for this event are numeric, except for
150	'comm' which is a string, a distinction important for event filtering.
151	
152	5. Event filtering
153	==================
154	
155	Trace events can be filtered in the kernel by associating boolean
156	'filter expressions' with them.  As soon as an event is logged into
157	the trace buffer, its fields are checked against the filter expression
158	associated with that event type.  An event with field values that
159	'match' the filter will appear in the trace output, and an event whose
160	values don't match will be discarded.  An event with no filter
161	associated with it matches everything, and is the default when no
162	filter has been set for an event.
163	
164	5.1 Expression syntax
165	---------------------
166	
167	A filter expression consists of one or more 'predicates' that can be
168	combined using the logical operators '&&' and '||'.  A predicate is
169	simply a clause that compares the value of a field contained within a
170	logged event with a constant value and returns either 0 or 1 depending
171	on whether the field value matched (1) or didn't match (0):
172	
173		  field-name relational-operator value
174	
175	Parentheses can be used to provide arbitrary logical groupings and
176	double-quotes can be used to prevent the shell from interpreting
177	operators as shell metacharacters.
178	
179	The field-names available for use in filters can be found in the
180	'format' files for trace events (see section 4).
181	
182	The relational-operators depend on the type of the field being tested:
183	
184	The operators available for numeric fields are:
185	
186	==, !=, <, <=, >, >=, &
187	
188	And for string fields they are:
189	
190	==, !=, ~
191	
192	The glob (~) only accepts a wild card character (*) at the start and or
193	end of the string. For example:
194	
195	  prev_comm ~ "*sh"
196	  prev_comm ~ "sh*"
197	  prev_comm ~ "*sh*"
198	
199	But does not allow for it to be within the string:
200	
201	  prev_comm ~ "ba*sh"   <-- is invalid
202	
203	5.2 Setting filters
204	-------------------
205	
206	A filter for an individual event is set by writing a filter expression
207	to the 'filter' file for the given event.
208	
209	For example:
210	
211	# cd /sys/kernel/debug/tracing/events/sched/sched_wakeup
212	# echo "common_preempt_count > 4" > filter
213	
214	A slightly more involved example:
215	
216	# cd /sys/kernel/debug/tracing/events/signal/signal_generate
217	# echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
218	
219	If there is an error in the expression, you'll get an 'Invalid
220	argument' error when setting it, and the erroneous string along with
221	an error message can be seen by looking at the filter e.g.:
222	
223	# cd /sys/kernel/debug/tracing/events/signal/signal_generate
224	# echo "((sig >= 10 && sig < 15) || dsig == 17) && comm != bash" > filter
225	-bash: echo: write error: Invalid argument
226	# cat filter
227	((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
228	^
229	parse_error: Field not found
230	
231	Currently the caret ('^') for an error always appears at the beginning of
232	the filter string; the error message should still be useful though
233	even without more accurate position info.
234	
235	5.3 Clearing filters
236	--------------------
237	
238	To clear the filter for an event, write a '0' to the event's filter
239	file.
240	
241	To clear the filters for all events in a subsystem, write a '0' to the
242	subsystem's filter file.
243	
244	5.3 Subsystem filters
245	---------------------
246	
247	For convenience, filters for every event in a subsystem can be set or
248	cleared as a group by writing a filter expression into the filter file
249	at the root of the subsystem.  Note however, that if a filter for any
250	event within the subsystem lacks a field specified in the subsystem
251	filter, or if the filter can't be applied for any other reason, the
252	filter for that event will retain its previous setting.  This can
253	result in an unintended mixture of filters which could lead to
254	confusing (to the user who might think different filters are in
255	effect) trace output.  Only filters that reference just the common
256	fields can be guaranteed to propagate successfully to all events.
257	
258	Here are a few subsystem filter examples that also illustrate the
259	above points:
260	
261	Clear the filters on all events in the sched subsystem:
262	
263	# cd /sys/kernel/debug/tracing/events/sched
264	# echo 0 > filter
265	# cat sched_switch/filter
266	none
267	# cat sched_wakeup/filter
268	none
269	
270	Set a filter using only common fields for all events in the sched
271	subsystem (all events end up with the same filter):
272	
273	# cd /sys/kernel/debug/tracing/events/sched
274	# echo common_pid == 0 > filter
275	# cat sched_switch/filter
276	common_pid == 0
277	# cat sched_wakeup/filter
278	common_pid == 0
279	
280	Attempt to set a filter using a non-common field for all events in the
281	sched subsystem (all events but those that have a prev_pid field retain
282	their old filters):
283	
284	# cd /sys/kernel/debug/tracing/events/sched
285	# echo prev_pid == 0 > filter
286	# cat sched_switch/filter
287	prev_pid == 0
288	# cat sched_wakeup/filter
289	common_pid == 0
290	
291	6. Event triggers
292	=================
293	
294	Trace events can be made to conditionally invoke trigger 'commands'
295	which can take various forms and are described in detail below;
296	examples would be enabling or disabling other trace events or invoking
297	a stack trace whenever the trace event is hit.  Whenever a trace event
298	with attached triggers is invoked, the set of trigger commands
299	associated with that event is invoked.  Any given trigger can
300	additionally have an event filter of the same form as described in
301	section 5 (Event filtering) associated with it - the command will only
302	be invoked if the event being invoked passes the associated filter.
303	If no filter is associated with the trigger, it always passes.
304	
305	Triggers are added to and removed from a particular event by writing
306	trigger expressions to the 'trigger' file for the given event.
307	
308	A given event can have any number of triggers associated with it,
309	subject to any restrictions that individual commands may have in that
310	regard.
311	
312	Event triggers are implemented on top of "soft" mode, which means that
313	whenever a trace event has one or more triggers associated with it,
314	the event is activated even if it isn't actually enabled, but is
315	disabled in a "soft" mode.  That is, the tracepoint will be called,
316	but just will not be traced, unless of course it's actually enabled.
317	This scheme allows triggers to be invoked even for events that aren't
318	enabled, and also allows the current event filter implementation to be
319	used for conditionally invoking triggers.
320	
321	The syntax for event triggers is roughly based on the syntax for
322	set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands'
323	section of Documentation/trace/ftrace.txt), but there are major
324	differences and the implementation isn't currently tied to it in any
325	way, so beware about making generalizations between the two.
326	
327	6.1 Expression syntax
328	---------------------
329	
330	Triggers are added by echoing the command to the 'trigger' file:
331	
332	  # echo 'command[:count] [if filter]' > trigger
333	
334	Triggers are removed by echoing the same command but starting with '!'
335	to the 'trigger' file:
336	
337	  # echo '!command[:count] [if filter]' > trigger
338	
339	The [if filter] part isn't used in matching commands when removing, so
340	leaving that off in a '!' command will accomplish the same thing as
341	having it in.
342	
343	The filter syntax is the same as that described in the 'Event
344	filtering' section above.
345	
346	For ease of use, writing to the trigger file using '>' currently just
347	adds or removes a single trigger and there's no explicit '>>' support
348	('>' actually behaves like '>>') or truncation support to remove all
349	triggers (you have to use '!' for each one added.)
350	
351	6.2 Supported trigger commands
352	------------------------------
353	
354	The following commands are supported:
355	
356	- enable_event/disable_event
357	
358	  These commands can enable or disable another trace event whenever
359	  the triggering event is hit.  When these commands are registered,
360	  the other trace event is activated, but disabled in a "soft" mode.
361	  That is, the tracepoint will be called, but just will not be traced.
362	  The event tracepoint stays in this mode as long as there's a trigger
363	  in effect that can trigger it.
364	
365	  For example, the following trigger causes kmalloc events to be
366	  traced when a read system call is entered, and the :1 at the end
367	  specifies that this enablement happens only once:
368	
369	  # echo 'enable_event:kmem:kmalloc:1' > \
370	      /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
371	
372	  The following trigger causes kmalloc events to stop being traced
373	  when a read system call exits.  This disablement happens on every
374	  read system call exit:
375	
376	  # echo 'disable_event:kmem:kmalloc' > \
377	      /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
378	
379	  The format is:
380	
381	      enable_event:<system>:<event>[:count]
382	      disable_event:<system>:<event>[:count]
383	
384	  To remove the above commands:
385	
386	  # echo '!enable_event:kmem:kmalloc:1' > \
387	      /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
388	
389	  # echo '!disable_event:kmem:kmalloc' > \
390	      /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
391	
392	  Note that there can be any number of enable/disable_event triggers
393	  per triggering event, but there can only be one trigger per
394	  triggered event. e.g. sys_enter_read can have triggers enabling both
395	  kmem:kmalloc and sched:sched_switch, but can't have two kmem:kmalloc
396	  versions such as kmem:kmalloc and kmem:kmalloc:1 or 'kmem:kmalloc if
397	  bytes_req == 256' and 'kmem:kmalloc if bytes_alloc == 256' (they
398	  could be combined into a single filter on kmem:kmalloc though).
399	
400	- stacktrace
401	
402	  This command dumps a stacktrace in the trace buffer whenever the
403	  triggering event occurs.
404	
405	  For example, the following trigger dumps a stacktrace every time the
406	  kmalloc tracepoint is hit:
407	
408	  # echo 'stacktrace' > \
409	        /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
410	
411	  The following trigger dumps a stacktrace the first 5 times a kmalloc
412	  request happens with a size >= 64K
413	
414	  # echo 'stacktrace:5 if bytes_req >= 65536' > \
415	        /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
416	
417	  The format is:
418	
419	      stacktrace[:count]
420	
421	  To remove the above commands:
422	
423	  # echo '!stacktrace' > \
424	        /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
425	
426	  # echo '!stacktrace:5 if bytes_req >= 65536' > \
427	        /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
428	
429	  The latter can also be removed more simply by the following (without
430	  the filter):
431	
432	  # echo '!stacktrace:5' > \
433	        /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
434	
435	  Note that there can be only one stacktrace trigger per triggering
436	  event.
437	
438	- snapshot
439	
440	  This command causes a snapshot to be triggered whenever the
441	  triggering event occurs.
442	
443	  The following command creates a snapshot every time a block request
444	  queue is unplugged with a depth > 1.  If you were tracing a set of
445	  events or functions at the time, the snapshot trace buffer would
446	  capture those events when the trigger event occurred:
447	
448	  # echo 'snapshot if nr_rq > 1' > \
449	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
450	
451	  To only snapshot once:
452	
453	  # echo 'snapshot:1 if nr_rq > 1' > \
454	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
455	
456	  To remove the above commands:
457	
458	  # echo '!snapshot if nr_rq > 1' > \
459	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
460	
461	  # echo '!snapshot:1 if nr_rq > 1' > \
462	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
463	
464	  Note that there can be only one snapshot trigger per triggering
465	  event.
466	
467	- traceon/traceoff
468	
469	  These commands turn tracing on and off when the specified events are
470	  hit. The parameter determines how many times the tracing system is
471	  turned on and off. If unspecified, there is no limit.
472	
473	  The following command turns tracing off the first time a block
474	  request queue is unplugged with a depth > 1.  If you were tracing a
475	  set of events or functions at the time, you could then examine the
476	  trace buffer to see the sequence of events that led up to the
477	  trigger event:
478	
479	  # echo 'traceoff:1 if nr_rq > 1' > \
480	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
481	
482	  To always disable tracing when nr_rq  > 1 :
483	
484	  # echo 'traceoff if nr_rq > 1' > \
485	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
486	
487	  To remove the above commands:
488	
489	  # echo '!traceoff:1 if nr_rq > 1' > \
490	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
491	
492	  # echo '!traceoff if nr_rq > 1' > \
493	        /sys/kernel/debug/tracing/events/block/block_unplug/trigger
494	
495	  Note that there can be only one traceon or traceoff trigger per
496	  triggering event.
Hide Line Numbers
About Kernel Documentation Linux Kernel Contact Linux Resources Linux Blog

Information is copyright its respective author. All material is available from the Linux Kernel Source distributed under a GPL License. This page is provided as a free service by mjmwired.net.