summaryrefslogblamecommitdiff
path: root/Documentation/fault-injection/fault-injection.txt
blob: cf075c20eda0fe5fa75af8d2dd2612a5971638b2 (plain) (tree)























































































                                                                            















































































                                                                              




















































                                                                              
Fault injection capabilities infrastructure
===========================================

See also drivers/md/faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

o failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

o fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_make_request

  injects disk IO errors on permitted devices by
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

Configure fault-injection capabilities behavior
-----------------------------------------------

o debugfs entries

fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /debug/*/probability:

	likelihood of failure injection, in percent.
	Format: <percent>

	Note that one-failure-per-handred is a very high error rate
	for some testcases. Please set probably=100 and configure
	/debug/*/interval for such testcases.

- /debug/*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /debug/*/times:

	specifies how many times failures may happen at most.
	A value of -1 means "no limit".

- /debug/*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(,size).  Failure injection is
	suppressed until "space" reaches zero.

- /debug/*/verbose

	Format: { 0 | 1 | 2 }
	specifies the verbosity of the messages when failure is injected.
	We default to 0 (no extra messages), setting it to '1' will
	print only to tell failure happened, '2' will print call trace too -
	it is useful to debug the problems revealed by fault injection
	capabilities.

- /debug/*/task-filter:

	Format: { 0 | 1 }
	A value of '0' disables filtering by process (default).
	Any positive value limits failures to only processes indicated by
	/proc/<pid>/make-it-fail==1.

- /debug/*/address-start:
- /debug/*/address-end:

	specifies the range of virtual addresses tested during
	stacktrace walking.  Failure is injected only if some caller
	in the walked stacktrace lies within this range.
	Default is [0,ULONG_MAX) (whole of virtual address space).

- /debug/*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [address-start,address-end).

- /debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 0 | 1 }
	default is 0, setting it to '1' won't inject failures into
	highmem/user allocations.

- /debug/failslab/ignore-gfp-wait:
- /debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 0 | 1 }
	default is 0, setting it to '1' will inject failures
	only into non-sleep allocations (GFP_ATOMIC allocations).

o Boot option

In order to inject faults while debugfs is not available (early boot time),
use the boot option:

	failslab=
	fail_page_alloc=
	fail_make_request=<interval>,<probability>,<space>,<times>

How to add new fault injection capability
-----------------------------------------

o #include <linux/fault-inject.h>

o define the fault attributes

  DECLARE_FAULT_INJECTION(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

o provide the way to configure fault attributes

- boot option

  If you need to enable the fault injection capability from boot time, you can
  provide boot option to configure it. There is a helper function for it.

  setup_fault_attr(attr, str);

- debugfs entries

  failslab, fail_page_alloc, and fail_make_request use this way.
  There is a helper function for it.

  init_fault_attr_entries(entries, attr, name);
  void cleanup_fault_attr_entries(entries);

- module parameters

  If the scope of the fault injection capability is limited to a
  single kernel module, it is better to provide module parameters to
  configure the fault attributes.

o add a hook to insert failures

  should_fail() returns 1 when failures should happen.

	should_fail(attr,size);

Application Examples
--------------------

o inject slab allocation failures into module init/cleanup code

------------------------------------------------------------------------------
#!/bin/bash

FAILCMD=Documentation/fault-injection/failcmd.sh
BLACKLIST="root_plug evbug"

FAILNAME=failslab
echo Y > /debug/$FAILNAME/task-filter
echo 10 > /debug/$FAILNAME/probability
echo 100 > /debug/$FAILNAME/interval
echo -1 > /debug/$FAILNAME/times
echo 2 > /debug/$FAILNAME/verbose
echo 1 > /debug/$FAILNAME/ignore-gfp-wait

blacklist()
{
	echo $BLACKLIST | grep $1 > /dev/null 2>&1
}

oops()
{
	dmesg | grep BUG > /dev/null 2>&1
}

find /lib/modules/`uname -r` -name '*.ko' -exec basename {} .ko \; |
	while read i
	do
		oops && exit 1

		if ! blacklist $i
		then
			echo inserting $i...
			bash $FAILCMD modprobe $i
		fi
	done

lsmod | awk '{ if ($3 == 0) { print $1 } }' |
	while read i
	do
		oops && exit 1

		if ! blacklist $i
		then
			echo removing $i...
			bash $FAILCMD modprobe -r $i
		fi
	done

------------------------------------------------------------------------------

o inject slab allocation failures only for a specific module

------------------------------------------------------------------------------
#!/bin/bash

FAILMOD=Documentation/fault-injection/failmodule.sh

echo injecting errors into the module $1...

modprobe $1
bash $FAILMOD failslab $1 10
echo 25 > /debug/failslab/probability

------------------------------------------------------------------------------