docs: filesystems: vfs: Render method descriptions

Currently vfs.rst does not render well into HTML the method descriptions for VFS data structures. We can improve the HTML output by putting the description string on a new line following the method name. Suggested-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Tobin C. Harding <tobin@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
author: Tobin C. Harding <tobin@kernel.org> 2019-06-04 10:26:56 +1000
committer: Jonathan Corbet <corbet@lwn.net> 2019-06-06 09:41:13 -0600
commit: ee5dc0491c38ae4e4e583d7532d470754bb173f6 (patch)
tree: 6b0d39e34a968dcb90387bf7f13bd67e3ce560aa /Documentation/filesystems/vfs.rst
parent: af96c1e304f7051bf2ee64c9957724bdace05c58 (diff)
download: lwn-ee5dc0491c38ae4e4e583d7532d470754bb173f6.tar.gz
lwn-ee5dc0491c38ae4e4e583d7532d470754bb173f6.zip
1 files changed, 642 insertions, 505 deletions
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 2ffbdf5f392c..0f85ab21c2ca 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -125,35 +125,46 @@ members are defined:
 		struct lock_class_key s_umount_key;
 	};
 
-``name``: the name of the filesystem type, such as "ext2", "iso9660",
+``name``
+	the name of the filesystem type, such as "ext2", "iso9660",
 	"msdos" and so on
 
-``fs_flags``: various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
+``fs_flags``
+	various flags (i.e. FS_REQUIRES_DEV, FS_NO_DCACHE, etc.)
 
-``mount``: the method to call when a new instance of this filesystem should
-be mounted
+``mount``
+	the method to call when a new instance of this filesystem should
+	be mounted
 
-``kill_sb``: the method to call when an instance of this filesystem
-	should be shut down
+``kill_sb``
+	the method to call when an instance of this filesystem should be
+	shut down
 
-``owner``: for internal VFS use: you should initialize this to THIS_MODULE in
-	most cases.
 
-``next``: for internal VFS use: you should initialize this to NULL
+``owner``
+	for internal VFS use: you should initialize this to THIS_MODULE
+	in most cases.
+
+``next``
+	for internal VFS use: you should initialize this to NULL
 
   s_lock_key, s_umount_key: lockdep-specific
 
 The mount() method has the following arguments:
 
-``struct file_system_type *fs_type``: describes the filesystem, partly initialized
-	by the specific filesystem code
+``struct file_system_type *fs_type``
+	describes the filesystem, partly initialized by the specific
+	filesystem code
 
-``int flags``: mount flags
+``int flags``
+	mount flags
 
-``const char *dev_name``: the device name we are mounting.
+``const char *dev_name``
+	the device name we are mounting.
 
-``void *data``: arbitrary mount options, usually comes as an ASCII
-	string (see "Mount Options" section)
+``void *data``
+	arbitrary mount options, usually comes as an ASCII string (see
+	"Mount Options" section)
 
 The mount() method must return the root dentry of the tree requested by
 caller.  An active reference to its superblock must be grabbed and the
@@ -178,22 +189,27 @@ implementation.
 Usually, a filesystem uses one of the generic mount() implementations
 and provides a fill_super() callback instead.  The generic variants are:
 
-``mount_bdev``: mount a filesystem residing on a block device
+``mount_bdev``
+	mount a filesystem residing on a block device
 
-``mount_nodev``: mount a filesystem that is not backed by a device
+``mount_nodev``
+	mount a filesystem that is not backed by a device
 
-``mount_single``: mount a filesystem which shares the instance between
-	all mounts
+``mount_single``
+	mount a filesystem which shares the instance between all mounts
 
 A fill_super() callback implementation has the following arguments:
 
-``struct super_block *sb``: the superblock structure.  The callback
-	must initialize this properly.
+``struct super_block *sb``
+	the superblock structure.  The callback must initialize this
+	properly.
 
-``void *data``: arbitrary mount options, usually comes as an ASCII
-	string (see "Mount Options" section)
+``void *data``
+	arbitrary mount options, usually comes as an ASCII string (see
+	"Mount Options" section)
 
-``int silent``: whether or not to be silent on error
+``int silent``
+	whether or not to be silent on error
 
 
 The Superblock Object
@@ -240,87 +256,106 @@ noted.  This means that most methods can block safely.  All methods are
 only called from a process context (i.e. not from an interrupt handler
 or bottom half).
 
-``alloc_inode``: this method is called by alloc_inode() to allocate memory
-	for struct inode and initialize it.  If this function is not
+``alloc_inode``
+	this method is called by alloc_inode() to allocate memory for
+	struct inode and initialize it.  If this function is not
 	defined, a simple 'struct inode' is allocated.  Normally
 	alloc_inode will be used to allocate a larger structure which
 	contains a 'struct inode' embedded within it.
 
-``destroy_inode``: this method is called by destroy_inode() to release
-	resources allocated for struct inode.  It is only required if
+``destroy_inode``
+	this method is called by destroy_inode() to release resources
+	allocated for struct inode.  It is only required if
 	->alloc_inode was defined and simply undoes anything done by
 	->alloc_inode.
 
-``dirty_inode``: this method is called by the VFS to mark an inode dirty.
+``dirty_inode``
+	this method is called by the VFS to mark an inode dirty.
 
-``write_inode``: this method is called when the VFS needs to write an
-	inode to disc.  The second parameter indicates whether the write
-	should be synchronous or not, not all filesystems check this flag.
+``write_inode``
+	this method is called when the VFS needs to write an inode to
+	disc.  The second parameter indicates whether the write should
+	be synchronous or not, not all filesystems check this flag.
 
-``drop_inode``: called when the last access to the inode is dropped,
-	with the inode->i_lock spinlock held.
+``drop_inode``
+	called when the last access to the inode is dropped, with the
+	inode->i_lock spinlock held.
 
 	This method should be either NULL (normal UNIX filesystem
-	semantics) or "generic_delete_inode" (for filesystems that do not
-	want to cache inodes - causing "delete_inode" to always be
+	semantics) or "generic_delete_inode" (for filesystems that do
+	not want to cache inodes - causing "delete_inode" to always be
 	called regardless of the value of i_nlink)
 
-	The "generic_delete_inode()" behavior is equivalent to the
-	old practice of using "force_delete" in the put_inode() case,
-	but does not have the races that the "force_delete()" approach
-	had. 
+	The "generic_delete_inode()" behavior is equivalent to the old
+	practice of using "force_delete" in the put_inode() case, but
+	does not have the races that the "force_delete()" approach had.
 
-``delete_inode``: called when the VFS wants to delete an inode
+``delete_inode``
+	called when the VFS wants to delete an inode
 
-``put_super``: called when the VFS wishes to free the superblock
+``put_super``
+	called when the VFS wishes to free the superblock
 	(i.e. unmount).  This is called with the superblock lock held
 
-``sync_fs``: called when VFS is writing out all dirty data associated with
-	a superblock.  The second parameter indicates whether the method
+``sync_fs``
+	called when VFS is writing out all dirty data associated with a
+	superblock.  The second parameter indicates whether the method
 	should wait until the write out has been completed.  Optional.
 
-``freeze_fs``: called when VFS is locking a filesystem and
-	forcing it into a consistent state.  This method is currently
-	used by the Logical Volume Manager (LVM).
+``freeze_fs``
+	called when VFS is locking a filesystem and forcing it into a
+	consistent state.  This method is currently used by the Logical
+	Volume Manager (LVM).
 
-``unfreeze_fs``: called when VFS is unlocking a filesystem and making it writable
+``unfreeze_fs``
+	called when VFS is unlocking a filesystem and making it writable
 	again.
 
-``statfs``: called when the VFS needs to get filesystem statistics.
+``statfs``
+	called when the VFS needs to get filesystem statistics.
 
-``remount_fs``: called when the filesystem is remounted.  This is called
-	with the kernel lock held
+``remount_fs``
+	called when the filesystem is remounted.  This is called with
+	the kernel lock held
 
-``clear_inode``: called then the VFS clears the inode.  Optional
+``clear_inode``
+	called then the VFS clears the inode.  Optional
 
-``umount_begin``: called when the VFS is unmounting a filesystem.
+``umount_begin``
+	called when the VFS is unmounting a filesystem.
 
-``show_options``: called by the VFS to show mount options for
-	/proc/<pid>/mounts.  (see "Mount Options" section)
+``show_options``
+	called by the VFS to show mount options for /proc/<pid>/mounts.
+	(see "Mount Options" section)
 
-``quota_read``: called by the VFS to read from filesystem quota file.
+``quota_read``
+	called by the VFS to read from filesystem quota file.
 
-``quota_write``: called by the VFS to write to filesystem quota file.
+``quota_write``
+	called by the VFS to write to filesystem quota file.
 
-``nr_cached_objects``: called by the sb cache shrinking function for the
-	filesystem to return the number of freeable cached objects it contains.
+``nr_cached_objects``
+	called by the sb cache shrinking function for the filesystem to
+	return the number of freeable cached objects it contains.
 	Optional.
 
-``free_cache_objects``: called by the sb cache shrinking function for the
-	filesystem to scan the number of objects indicated to try to free them.
-	Optional, but any filesystem implementing this method needs to also
-	implement ->nr_cached_objects for it to be called correctly.
+``free_cache_objects``
+	called by the sb cache shrinking function for the filesystem to
+	scan the number of objects indicated to try to free them.
+	Optional, but any filesystem implementing this method needs to
+	also implement ->nr_cached_objects for it to be called
+	correctly.
 
 	We can't do anything with any errors that the filesystem might
-	encountered, hence the void return type.  This will never be called if
-	the VM is trying to reclaim under GFP_NOFS conditions, hence this
-	method does not need to handle that situation itself.
+	encountered, hence the void return type.  This will never be
+	called if the VM is trying to reclaim under GFP_NOFS conditions,
+	hence this method does not need to handle that situation itself.
 
-	Implementations must include conditional reschedule calls inside any
-	scanning loop that is done.  This allows the VFS to determine
-	appropriate scan batch sizes without having to worry about whether
-	implementations will cause holdoff problems due to large scan batch
-	sizes.
+	Implementations must include conditional reschedule calls inside
+	any scanning loop that is done.  This allows the VFS to
+	determine appropriate scan batch sizes without having to worry
+	about whether implementations will cause holdoff problems due to
+	large scan batch sizes.
 
 Whoever sets up the inode is responsible for filling in the "i_op"
 field.  This is a pointer to a "struct inode_operations" which describes
@@ -334,23 +369,31 @@ On filesystems that support extended attributes (xattrs), the s_xattr
 superblock field points to a NULL-terminated array of xattr handlers.
 Extended attributes are name:value pairs.
 
-``name``: Indicates that the handler matches attributes with the specified name
-	(such as "system.posix_acl_access"); the prefix field must be NULL.
+``name``
+	Indicates that the handler matches attributes with the specified
+	name (such as "system.posix_acl_access"); the prefix field must
+	be NULL.
 
-``prefix``: Indicates that the handler matches all attributes with the specified
-	name prefix (such as "user."); the name field must be NULL.
+``prefix``
+	Indicates that the handler matches all attributes with the
+	specified name prefix (such as "user."); the name field must be
+	NULL.
 
-``list``: Determine if attributes matching this xattr handler should be listed
-	for a particular dentry.  Used by some listxattr implementations like
-	generic_listxattr.
+``list``
+	Determine if attributes matching this xattr handler should be
+	listed for a particular dentry.  Used by some listxattr
+	implementations like generic_listxattr.
 
-``get``: Called by the VFS to get the value of a particular extended attribute.
-	This method is called by the getxattr(2) system call.
+``get``
+	Called by the VFS to get the value of a particular extended
+	attribute.  This method is called by the getxattr(2) system
+	call.
 
-``set``: Called by the VFS to set the value of a particular extended attribute.
-	When the new value is NULL, called to remove a particular extended
-	attribute.  This method is called by the the setxattr(2) and
-	removexattr(2) system calls.
+``set``
+	Called by the VFS to set the value of a particular extended
+	attribute.  When the new value is NULL, called to remove a
+	particular extended attribute.  This method is called by the the
+	setxattr(2) and removexattr(2) system calls.
 
 When none of the xattr handlers of a filesystem match the specified
 attribute name or when a filesystem doesn't support extended attributes,
@@ -399,128 +442,147 @@ As of kernel 2.6.22, the following members are defined:
 Again, all methods are called without any locks being held, unless
 otherwise noted.
 
-``create``: called by the open(2) and creat(2) system calls.  Only
-	required if you want to support regular files.  The dentry you
-	get should not have an inode (i.e. it should be a negative
-	dentry).  Here you will probably call d_instantiate() with the
-	dentry and the newly created inode
+``create``
+	called by the open(2) and creat(2) system calls.  Only required
+	if you want to support regular files.  The dentry you get should
+	not have an inode (i.e. it should be a negative dentry).  Here
+	you will probably call d_instantiate() with the dentry and the
+	newly created inode
 
-``lookup``: called when the VFS needs to look up an inode in a parent
+``lookup``
+	called when the VFS needs to look up an inode in a parent
 	directory.  The name to look for is found in the dentry.  This
 	method must call d_add() to insert the found inode into the
 	dentry.  The "i_count" field in the inode structure should be
 	incremented.  If the named inode does not exist a NULL inode
 	should be inserted into the dentry (this is called a negative
-	dentry).  Returning an error code from this routine must only
-	be done on a real error, otherwise creating inodes with system
+	dentry).  Returning an error code from this routine must only be
+	done on a real error, otherwise creating inodes with system
 	calls like create(2), mknod(2), mkdir(2) and so on will fail.
 	If you wish to overload the dentry methods then you should
-	initialise the "d_dop" field in the dentry; this is a pointer
-	to a struct "dentry_operations".
-	This method is called with the directory inode semaphore held
+	initialise the "d_dop" field in the dentry; this is a pointer to
+	a struct "dentry_operations".  This method is called with the
+	directory inode semaphore held
 
-``link``: called by the link(2) system call.  Only required if you want
-	to support hard links.  You will probably need to call
+``link``
+	called by the link(2) system call.  Only required if you want to
+	support hard links.  You will probably need to call
 	d_instantiate() just as you would in the create() method
 
-``unlink``: called by the unlink(2) system call.  Only required if you
-	want to support deleting inodes
+``unlink``
+	called by the unlink(2) system call.  Only required if you want
+	to support deleting inodes
 
-``symlink``: called by the symlink(2) system call.  Only required if you
-	want to support symlinks.  You will probably need to call
+``symlink``
+	called by the symlink(2) system call.  Only required if you want
+	to support symlinks.  You will probably need to call
 	d_instantiate() just as you would in the create() method
 
-``mkdir``: called by the mkdir(2) system call.  Only required if you want
+``mkdir``
+	called by the mkdir(2) system call.  Only required if you want
 	to support creating subdirectories.  You will probably need to
 	call d_instantiate() just as you would in the create() method
 
-``rmdir``: called by the rmdir(2) system call.  Only required if you want
+``rmdir``
+	called by the rmdir(2) system call.  Only required if you want
 	to support deleting subdirectories
 
-``mknod``: called by the mknod(2) system call to create a device (char,
-	block) inode or a named pipe (FIFO) or socket.  Only required
-	if you want to support creating these types of inodes.  You
-	will probably need to call d_instantiate() just as you would
-	in the create() method
+``mknod``
+	called by the mknod(2) system call to create a device (char,
+	block) inode or a named pipe (FIFO) or socket.  Only required if
+	you want to support creating these types of inodes.  You will
+	probably need to call d_instantiate() just as you would in the
+	create() method
 
-``rename``: called by the rename(2) system call to rename the object to
-	have the parent and name given by the second inode and dentry.
+``rename``
+	called by the rename(2) system call to rename the object to have
+	the parent and name given by the second inode and dentry.
 
 	The filesystem must return -EINVAL for any unsupported or
-	unknown	flags.  Currently the following flags are implemented:
-	(1) RENAME_NOREPLACE: this flag indicates that if the target
-	of the rename exists the rename should fail with -EEXIST
-	instead of replacing the target.  The VFS already checks for
-	existence, so for local filesystems the RENAME_NOREPLACE
-	implementation is equivalent to plain rename.
+	unknown flags.  Currently the following flags are implemented:
+	(1) RENAME_NOREPLACE: this flag indicates that if the target of
+	the rename exists the rename should fail with -EEXIST instead of
+	replacing the target.  The VFS already checks for existence, so
+	for local filesystems the RENAME_NOREPLACE implementation is
+	equivalent to plain rename.
 	(2) RENAME_EXCHANGE: exchange source and target.  Both must
-	exist; this is checked by the VFS.  Unlike plain rename,
-	source and target may be of different type.
-
-``get_link``: called by the VFS to follow a symbolic link to the
-	inode it points to.  Only required if you want to support
-	symbolic links.  This method returns the symlink body
-	to traverse (and possibly resets the current position with
-	nd_jump_link()).  If the body won't go away until the inode
-	is gone, nothing else is needed; if it needs to be otherwise
-	pinned, arrange for its release by having get_link(..., ..., done)
-	do set_delayed_call(done, destructor, argument).
-	In that case destructor(argument) will be called once VFS is
-	done with the body you've returned.
-	May be called in RCU mode; that is indicated by NULL dentry
+	exist; this is checked by the VFS.  Unlike plain rename, source
+	and target may be of different type.
+
+``get_link``
+	called by the VFS to follow a symbolic link to the inode it
+	points to.  Only required if you want to support symbolic links.
+	This method returns the symlink body to traverse (and possibly
+	resets the current position with nd_jump_link()).  If the body
+	won't go away until the inode is gone, nothing else is needed;
+	if it needs to be otherwise pinned, arrange for its release by
+	having get_link(..., ..., done) do set_delayed_call(done,
+	destructor, argument).  In that case destructor(argument) will
+	be called once VFS is done with the body you've returned.  May
+	be called in RCU mode; that is indicated by NULL dentry
 	argument.  If request can't be handled without leaving RCU mode,
 	have it return ERR_PTR(-ECHILD).
 
-
 	If the filesystem stores the symlink target in ->i_link, the
 	VFS may use it directly without calling ->get_link(); however,
 	->get_link() must still be provided.  ->i_link must not be
 	freed until after an RCU grace period.  Writing to ->i_link
 	post-iget() time requires a 'release' memory barrier.
 
-``readlink``: this is now just an override for use by readlink(2) for the
+``readlink``
+	this is now just an override for use by readlink(2) for the
 	cases when ->get_link uses nd_jump_link() or object is not in
 	fact a symlink.  Normally filesystems should only implement
 	->get_link for symlinks and readlink(2) will automatically use
 	that.
 
-``permission``: called by the VFS to check for access rights on a POSIX-like
+``permission``
+	called by the VFS to check for access rights on a POSIX-like
 	filesystem.
 
-	May be called in rcu-walk mode (mask & MAY_NOT_BLOCK).  If in rcu-walk
-	mode, the filesystem must check the permission without blocking or
-	storing to the inode.
+	May be called in rcu-walk mode (mask & MAY_NOT_BLOCK).  If in
+	rcu-walk mode, the filesystem must check the permission without
+	blocking or storing to the inode.
 
-	If a situation is encountered that rcu-walk cannot handle, return
+	If a situation is encountered that rcu-walk cannot handle,
+	return
 	-ECHILD and it will be called again in ref-walk mode.
 
-``setattr``: called by the VFS to set attributes for a file.  This method
-	is called by chmod(2) and related system calls.
-
-``getattr``: called by the VFS to get attributes of a file.  This method
-	is called by stat(2) and related system calls.
-
-``listxattr``: called by the VFS to list all extended attributes for a
-	given file.  This method is called by the listxattr(2) system call.
-
-``update_time``: called by the VFS to update a specific time or the i_version of
-	an inode.  If this is not defined the VFS will update the inode itself
-	and call mark_inode_dirty_sync.
-
-``atomic_open``: called on the last component of an open.  Using this optional
-	method the filesystem can look up, possibly create and open the file in
-	one atomic operation.  If it wants to leave actual opening to the
-	caller (e.g. if the file turned out to be a symlink, device, or just
-	something filesystem won't do atomic open for), it may signal this by
-	returning finish_no_open(file, dentry).  This method is only called if
-	the last component is negative or needs lookup.  Cached positive dentries
-	are still handled by f_op->open().  If the file was created,
-	FMODE_CREATED flag should be set in file->f_mode.  In case of O_EXCL
-	the method must only succeed if the file didn't exist and hence FMODE_CREATED
-	shall always be set on success.
-
-``tmpfile``: called in the end of O_TMPFILE open().  Optional, equivalent to
-	atomically creating, opening and unlinking a file in given directory.
+``setattr``
+	called by the VFS to set attributes for a file.  This method is
+	called by chmod(2) and related system calls.
+
+``getattr``
+	called by the VFS to get attributes of a file.  This method is
+	called by stat(2) and related system calls.
+
+``listxattr``
+	called by the VFS to list all extended attributes for a given
+	file.  This method is called by the listxattr(2) system call.
+
+``update_time``
+	called by the VFS to update a specific time or the i_version of
+	an inode.  If this is not defined the VFS will update the inode
+	itself and call mark_inode_dirty_sync.
+
+``atomic_open``
+	called on the last component of an open.  Using this optional
+	method the filesystem can look up, possibly create and open the
+	file in one atomic operation.  If it wants to leave actual
+	opening to the caller (e.g. if the file turned out to be a
+	symlink, device, or just something filesystem won't do atomic
+	open for), it may signal this by returning finish_no_open(file,
+	dentry).  This method is only called if the last component is
+	negative or needs lookup.  Cached positive dentries are still
+	handled by f_op->open().  If the file was created, FMODE_CREATED
+	flag should be set in file->f_mode.  In case of O_EXCL the
+	method must only succeed if the file didn't exist and hence
+	FMODE_CREATED shall always be set on success.
+
+``tmpfile``
+	called in the end of O_TMPFILE open().  Optional, equivalent to
+	atomically creating, opening and unlinking a file in given
+	directory.
 
 
 The Address Space Object
@@ -673,70 +735,75 @@ cache in your filesystem.  The following members are defined:
 		int (*swap_deactivate)(struct file *);
 	};
 
-``writepage``: called by the VM to write a dirty page to backing store.
-      This may happen for data integrity reasons (i.e. 'sync'), or
-      to free up memory (flush).  The difference can be seen in
-      wbc->sync_mode.
-      The PG_Dirty flag has been cleared and PageLocked is true.
-      writepage should start writeout, should set PG_Writeback,
-      and should make sure the page is unlocked, either synchronously
-      or asynchronously when the write operation completes.
-
-      If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
-      try too hard if there are problems, and may choose to write out
-      other pages from the mapping if that is easier (e.g. due to
-      internal dependencies).  If it chooses not to start writeout, it
-      should return AOP_WRITEPAGE_ACTIVATE so that the VM will not keep
-      calling ->writepage on that page.
-
-      See the file "Locking" for more details.
-
-``readpage``: called by the VM to read a page from backing store.
-       The page will be Locked when readpage is called, and should be
-       unlocked and marked uptodate once the read completes.
-       If ->readpage discovers that it needs to unlock the page for
-       some reason, it can do so, and then return AOP_TRUNCATED_PAGE.
-       In this case, the page will be relocated, relocked and if
-       that all succeeds, ->readpage will be called again.
-
-``writepages``: called by the VM to write out pages associated with the
+``writepage``
+	called by the VM to write a dirty page to backing store.  This
+	may happen for data integrity reasons (i.e. 'sync'), or to free
+	up memory (flush).  The difference can be seen in
+	wbc->sync_mode.  The PG_Dirty flag has been cleared and
+	PageLocked is true.  writepage should start writeout, should set
+	PG_Writeback, and should make sure the page is unlocked, either
+	synchronously or asynchronously when the write operation
+	completes.
+
+	If wbc->sync_mode is WB_SYNC_NONE, ->writepage doesn't have to
+	try too hard if there are problems, and may choose to write out
+	other pages from the mapping if that is easier (e.g. due to
+	internal dependencies).  If it chooses not to start writeout, it
+	should return AOP_WRITEPAGE_ACTIVATE so that the VM will not
+	keep calling ->writepage on that page.
+
+	See the file "Locking" for more details.
+
+``readpage``
+	called by the VM to read a page from backing store.  The page
+	will be Locked when readpage is called, and should be unlocked
+	and marked uptodate once the read completes.  If ->readpage
+	discovers that it needs to unlock the page for some reason, it
+	can do so, and then return AOP_TRUNCATED_PAGE.  In this case,
+	the page will be relocated, relocked and if that all succeeds,
+	->readpage will be called again.
+
+``writepages``
+	called by the VM to write out pages associated with the
 	address_space object.  If wbc->sync_mode is WBC_SYNC_ALL, then
 	the writeback_control will specify a range of pages that must be
-	written out.  If it is WBC_SYNC_NONE, then a nr_to_write is given
-	and that many pages should be written if possible.
-	If no ->writepages is given, then mpage_writepages is used
-	instead.  This will choose pages from the address space that are
-	tagged as DIRTY and will pass them to ->writepage.
-
-``set_page_dirty``: called by the VM to set a page dirty.
-	This is particularly needed if an address space attaches
-	private data to a page, and that data needs to be updated when
-	a page is dirtied.  This is called, for example, when a memory
-	mapped page gets modified.
+	written out.  If it is WBC_SYNC_NONE, then a nr_to_write is
+	given and that many pages should be written if possible.  If no
+	->writepages is given, then mpage_writepages is used instead.
+	This will choose pages from the address space that are tagged as
+	DIRTY and will pass them to ->writepage.
+
+``set_page_dirty``
+	called by the VM to set a page dirty.  This is particularly
+	needed if an address space attaches private data to a page, and
+	that data needs to be updated when a page is dirtied.  This is
+	called, for example, when a memory mapped page gets modified.
 	If defined, it should set the PageDirty flag, and the
 	PAGECACHE_TAG_DIRTY tag in the radix tree.
 
-``readpages``: called by the VM to read pages associated with the address_space
-	object.  This is essentially just a vector version of
-	readpage.  Instead of just one page, several pages are
-	requested.
+``readpages``
+	called by the VM to read pages associated with the address_space
+	object.  This is essentially just a vector version of readpage.
+	Instead of just one page, several pages are requested.
 	readpages is only used for read-ahead, so read errors are
 	ignored.  If anything goes wrong, feel free to give up.
 
-``write_begin``:
-	Called by the generic buffered write code to ask the filesystem to
-	prepare to write len bytes at the given offset in the file.  The
-	address_space should check that the write will be able to complete,
-	by allocating space if necessary and doing any other internal
-	housekeeping.  If the write will update parts of any basic-blocks on
-	storage, then those blocks should be pre-read (if they haven't been
-	read already) so that the updated blocks can be written out properly.
+``write_begin``
+	Called by the generic buffered write code to ask the filesystem
+	to prepare to write len bytes at the given offset in the file.
+	The address_space should check that the write will be able to
+	complete, by allocating space if necessary and doing any other
+	internal housekeeping.  If the write will update parts of any
+	basic-blocks on storage, then those blocks should be pre-read
+	(if they haven't been read already) so that the updated blocks
+	can be written out properly.
 
-	The filesystem must return the locked pagecache page for the specified
-	offset, in ``*pagep``, for the caller to write into.
+	The filesystem must return the locked pagecache page for the
+	specified offset, in ``*pagep``, for the caller to write into.
 
-	It must be able to cope with short writes (where the length passed to
-	write_begin is greater than the number of bytes copied into the page).
+	It must be able to cope with short writes (where the length
+	passed to write_begin is greater than the number of bytes copied
+	into the page).
 
 	flags is a field for AOP_FLAG_xxx flags, described in
 	include/linux/fs.h.
@@ -744,114 +811,128 @@ cache in your filesystem.  The following members are defined:
 	A void * may be returned in fsdata, which then gets passed into
 	write_end.
 
-	Returns 0 on success; < 0 on failure (which is the error code), in
-	which case write_end is not called.
-
-``write_end``: After a successful write_begin, and data copy, write_end must
-	be called.  len is the original len passed to write_begin, and copied
-	is the amount that was able to be copied.
-
-	The filesystem must take care of unlocking the page and releasing it
-	refcount, and updating i_size.
-
-	Returns < 0 on failure, otherwise the number of bytes (<= 'copied')
-	that were able to be copied into pagecache.
-
-``bmap``: called by the VFS to map a logical block offset within object to
-	physical block number.  This method is used by the FIBMAP
-	ioctl and for working with swap-files.  To be able to swap to
-	a file, the file must have a stable mapping to a block
-	device.  The swap system does not go through the filesystem
-	but instead uses bmap to find out where the blocks in the file
-	are and uses those addresses directly.
-
-``invalidatepage``: If a page has PagePrivate set, then invalidatepage
-	will be called when part or all of the page is to be removed
-	from the address space.  This generally corresponds to either a
-	truncation, punch hole  or a complete invalidation of the address
+	Returns 0 on success; < 0 on failure (which is the error code),
+	in which case write_end is not called.
+
+``write_end``
+	After a successful write_begin, and data copy, write_end must be
+	called.  len is the original len passed to write_begin, and
+	copied is the amount that was able to be copied.
+
+	The filesystem must take care of unlocking the page and
+	releasing it refcount, and updating i_size.
+
+	Returns < 0 on failure, otherwise the number of bytes (<=
+	'copied') that were able to be copied into pagecache.
+
+``bmap``
+	called by the VFS to map a logical block offset within object to
+	physical block number.  This method is used by the FIBMAP ioctl
+	and for working with swap-files.  To be able to swap to a file,
+	the file must have a stable mapping to a block device.  The swap
+	system does not go through the filesystem but instead uses bmap
+	to find out where the blocks in the file are and uses those
+	addresses directly.
+
+``invalidatepage``
+	If a page has PagePrivate set, then invalidatepage will be
+	called when part or all of the page is to be removed from the
+	address space.  This generally corresponds to either a
+	truncation, punch hole or a complete invalidation of the address
 	space (in the latter case 'offset' will always be 0 and 'length'
 	will be PAGE_SIZE).  Any private data associated with the page
-	should be updated to reflect this truncation.  If offset is 0 and
-	length is PAGE_SIZE, then the private data should be released,
-	because the page must be able to be completely discarded.  This may
-	be done by calling the ->releasepage function, but in this case the
-	release MUST succeed.
-
-``releasepage``: releasepage is called on PagePrivate pages to indicate
-	that the page should be freed if possible.  ->releasepage
-	should remove any private data from the page and clear the
-	PagePrivate flag.  If releasepage() fails for some reason, it must
-	indicate failure with a 0 return value.
-	releasepage() is used in two distinct though related cases.  The
-	first is when the VM finds a clean page with no active users and
-	wants to make it a free page.  If ->releasepage succeeds, the
-	page will be removed from the address_space and become free.
+	should be updated to reflect this truncation.  If offset is 0
+	and length is PAGE_SIZE, then the private data should be
+	released, because the page must be able to be completely
+	discarded.  This may be done by calling the ->releasepage
+	function, but in this case the release MUST succeed.
+
+``releasepage``
+	releasepage is called on PagePrivate pages to indicate that the
+	page should be freed if possible.  ->releasepage should remove
+	any private data from the page and clear the PagePrivate flag.
+	If releasepage() fails for some reason, it must indicate failure
+	with a 0 return value.  releasepage() is used in two distinct
+	though related cases.  The first is when the VM finds a clean
+	page with no active users and wants to make it a free page.  If
+	->releasepage succeeds, the page will be removed from the
+	address_space and become free.
 
 	The second case is when a request has been made to invalidate
-	some or all pages in an address_space.  This can happen
-	through the fadvise(POSIX_FADV_DONTNEED) system call or by the
-	filesystem explicitly requesting it as nfs and 9fs do (when
-	they believe the cache may be out of date with storage) by
-	calling invalidate_inode_pages2().
-	If the filesystem makes such a call, and needs to be certain
-	that all pages are invalidated, then its releasepage will
-	need to ensure this.  Possibly it can clear the PageUptodate
-	bit if it cannot free private data yet.
-
-``freepage``: freepage is called once the page is no longer visible in
-	the page cache in order to allow the cleanup of any private
-	data.  Since it may be called by the memory reclaimer, it
-	should not assume that the original address_space mapping still
-	exists, and it should not block.
-
-``direct_IO``: called by the generic read/write routines to perform
-	direct_IO - that is IO requests which bypass the page cache
-	and transfer data directly between the storage and the
-	application's address space.
-
-``isolate_page``: Called by the VM when isolating a movable non-lru page.
-	If page is successfully isolated, VM marks the page as PG_isolated
-	via __SetPageIsolated.
-
-``migrate_page``:  This is used to compact the physical memory usage.
-	If the VM wants to relocate a page (maybe off a memory card
-	that is signalling imminent failure) it will pass a new page
-	and an old page to this function.  migrate_page should
-	transfer any private data across and update any references
-	that it has to the page.
-
-``putback_page``: Called by the VM when isolated page's migration fails.
-
-``launder_page``: Called before freeing a page - it writes back the dirty page.  To
-	prevent redirtying the page, it is kept locked during the whole
-	operation.
-
-``is_partially_uptodate``: Called by the VM when reading a file through the
-	pagecache when the underlying blocksize != pagesize.  If the required
-	block is up to date then the read can complete without needing the IO
-	to bring the whole page up to date.
-
-``is_dirty_writeback``: Called by the VM when attempting to reclaim a page.
-	The VM uses dirty and writeback information to determine if it needs
-	to stall to allow flushers a chance to complete some IO.  Ordinarily
-	it can use PageDirty and PageWriteback but some filesystems have
-	more complex state (unstable pages in NFS prevent reclaim) or
-	do not set those flags due to locking problems.  This callback
-	allows a filesystem to indicate to the VM if a page should be
-	treated as dirty or writeback for the purposes of stalling.
-
-``error_remove_page``: normally set to generic_error_remove_page if truncation
-	is ok for this address space.  Used for memory failure handling.
+	some or all pages in an address_space.  This can happen through
+	the fadvise(POSIX_FADV_DONTNEED) system call or by the
+	filesystem explicitly requesting it as nfs and 9fs do (when they
+	believe the cache may be out of date with storage) by calling
+	invalidate_inode_pages2().  If the filesystem makes such a call,
+	and needs to be certain that all pages are invalidated, then its
+	releasepage will need to ensure this.  Possibly it can clear the
+	PageUptodate bit if it cannot free private data yet.
+
+``freepage``
+	freepage is called once the page is no longer visible in the
+	page cache in order to allow the cleanup of any private data.
+	Since it may be called by the memory reclaimer, it should not
+	assume that the original address_space mapping still exists, and
+	it should not block.
+
+``direct_IO``
+	called by the generic read/write routines to perform direct_IO -
+	that is IO requests which bypass the page cache and transfer
+	data directly between the storage and the application's address
+	space.
+
+``isolate_page``
+	Called by the VM when isolating a movable non-lru page.  If page
+	is successfully isolated, VM marks the page as PG_isolated via
+	__SetPageIsolated.
+
+``migrate_page``
+	This is used to compact the physical memory usage.  If the VM
+	wants to relocate a page (maybe off a memory card that is
+	signalling imminent failure) it will pass a new page and an old
+	page to this function.  migrate_page should transfer any private
+	data across and update any references that it has to the page.
+
+``putback_page``
+	Called by the VM when isolated page's migration fails.
+
+``launder_page``
+	Called before freeing a page - it writes back the dirty page.
+	To prevent redirtying the page, it is kept locked during the
+	whole operation.
+
+``is_partially_uptodate``
+	Called by the VM when reading a file through the pagecache when
+	the underlying blocksize != pagesize.  If the required block is
+	up to date then the read can complete without needing the IO to
+	bring the whole page up to date.
+
+``is_dirty_writeback``
+	Called by the VM when attempting to reclaim a page.  The VM uses
+	dirty and writeback information to determine if it needs to
+	stall to allow flushers a chance to complete some IO.
+	Ordinarily it can use PageDirty and PageWriteback but some
+	filesystems have more complex state (unstable pages in NFS
+	prevent reclaim) or do not set those flags due to locking
+	problems.  This callback allows a filesystem to indicate to the
+	VM if a page should be treated as dirty or writeback for the
+	purposes of stalling.
+
+``error_remove_page``
+	normally set to generic_error_remove_page if truncation is ok
+	for this address space.  Used for memory failure handling.
 	Setting this implies you deal with pages going away under you,
 	unless you have them locked or reference counts increased.
 
-``swap_activate``: Called when swapon is used on a file to allocate
-	space if necessary and pin the block lookup information in
-	memory.  A return value of zero indicates success,
-	in which case this file can be used to back swapspace.
+``swap_activate``
+	Called when swapon is used on a file to allocate space if
+	necessary and pin the block lookup information in memory.  A
+	return value of zero indicates success, in which case this file
+	can be used to back swapspace.
 
-``swap_deactivate``: Called during swapoff on files where swap_activate
-	was successful.
+``swap_deactivate``
+	Called during swapoff on files where swap_activate was
+	successful.
 
 
 The File Object
@@ -912,91 +993,120 @@ This describes how the VFS can manipulate an open file.  As of kernel
 Again, all methods are called without any locks being held, unless
 otherwise noted.
 
-``llseek``: called when the VFS needs to move the file position index
+``llseek``
+	called when the VFS needs to move the file position index
 
-``read``: called by read(2) and related system calls
+``read``
+	called by read(2) and related system calls
 
-``read_iter``: possibly asynchronous read with iov_iter as destination
+``read_iter``
+	possibly asynchronous read with iov_iter as destination
 
-``write``: called by write(2) and related system calls
+``write``
+	called by write(2) and related system calls
 
-``write_iter``: possibly asynchronous write with iov_iter as source
+``write_iter``
+	possibly asynchronous write with iov_iter as source
 
-``iopoll``: called when aio wants to poll for completions on HIPRI iocbs
+``iopoll``
+	called when aio wants to poll for completions on HIPRI iocbs
 
-``iterate``: called when the VFS needs to read the directory contents
+``iterate``
+	called when the VFS needs to read the directory contents
 
-``iterate_shared``: called when the VFS needs to read the directory contents
-	when filesystem supports concurrent dir iterators
+``iterate_shared``
+	called when the VFS needs to read the directory contents when
+	filesystem supports concurrent dir iterators
 
-``poll``: called by the VFS when a process wants to check if there is
+``poll``
+	called by the VFS when a process wants to check if there is
 	activity on this file and (optionally) go to sleep until there
 	is activity.  Called by the select(2) and poll(2) system calls
 
-``unlocked_ioctl``: called by the ioctl(2) system call.
+``unlocked_ioctl``
+	called by the ioctl(2) system call.
 
-``compat_ioctl``: called by the ioctl(2) system call when 32 bit system calls
-	 are used on 64 bit kernels.
+``compat_ioctl``
+	called by the ioctl(2) system call when 32 bit system calls are
+	 used on 64 bit kernels.
 
-``mmap``: called by the mmap(2) system call
+``mmap``
+	called by the mmap(2) system call
 
-``open``: called by the VFS when an inode should be opened.  When the VFS
+``open``
+	called by the VFS when an inode should be opened.  When the VFS
 	opens a file, it creates a new "struct file".  It then calls the
 	open method for the newly allocated file structure.  You might
-	think that the open method really belongs in
-	"struct inode_operations", and you may be right.  I think it's
-	done the way it is because it makes filesystems simpler to
-	implement.  The open() method is a good place to initialize the
+	think that the open method really belongs in "struct
+	inode_operations", and you may be right.  I think it's done the
+	way it is because it makes filesystems simpler to implement.
+	The open() method is a good place to initialize the
 	"private_data" member in the file structure if you want to point
 	to a device structure
 
-``flush``: called by the close(2) system call to flush a file
+``flush``
+	called by the close(2) system call to flush a file
 
-``release``: called when the last reference to an open file is closed
+``release``
+	called when the last reference to an open file is closed
 
-``fsync``: called by the fsync(2) system call.  Also see the section above
-	 entitled "Handling errors during writeback".
+``fsync``
+	called by the fsync(2) system call.  Also see the section above
+	entitled "Handling errors during writeback".
 
-``fasync``: called by the fcntl(2) system call when asynchronous
+``fasync``
+	called by the fcntl(2) system call when asynchronous
 	(non-blocking) mode is enabled for a file
 
-``lock``: called by the fcntl(2) system call for F_GETLK, F_SETLK, and F_SETLKW
-	commands
+``lock``
+	called by the fcntl(2) system call for F_GETLK, F_SETLK, and
+	F_SETLKW commands
 
-``get_unmapped_area``: called by the mmap(2) system call
+``get_unmapped_area``
+	called by the mmap(2) system call
 
-``check_flags``: called by the fcntl(2) system call for F_SETFL command
+``check_flags``
+	called by the fcntl(2) system call for F_SETFL command
 
-``flock``: called by the flock(2) system call
+``flock``
+	called by the flock(2) system call
 
-``splice_write``: called by the VFS to splice data from a pipe to a file.  This
-		method is used by the splice(2) system call
+``splice_write``
+	called by the VFS to splice data from a pipe to a file.  This
+	method is used by the splice(2) system call
 
-``splice_read``: called by the VFS to splice data from file to a pipe.  This
-	       method is used by the splice(2) system call
+``splice_read``
+	called by the VFS to splice data from file to a pipe.  This
+	method is used by the splice(2) system call
 
-``setlease``: called by the VFS to set or release a file lock lease.  setlease
-	    implementations should call generic_setlease to record or remove
-	    the lease in the inode after setting it.
+``setlease``
+	called by the VFS to set or release a file lock lease.  setlease
+	implementations should call generic_setlease to record or remove
+	the lease in the inode after setting it.
 
-``fallocate``: called by the VFS to preallocate blocks or punch a hole.
+``fallocate``
+	called by the VFS to preallocate blocks or punch a hole.
 
-``copy_file_range``: called by the copy_file_range(2) system call.
+``copy_file_range``
+	called by the copy_file_range(2) system call.
 
-``remap_file_range``: called by the ioctl(2) system call for FICLONERANGE and
-	FICLONE and FIDEDUPERANGE commands to remap file ranges.  An
-	implementation should remap len bytes at pos_in of the source file into
-	the dest file at pos_out.  Implementations must handle callers passing
-	in len == 0; this means "remap to the end of the source file".  The
-	return value should the number of bytes remapped, or the usual
-	negative error code if errors occurred before any bytes were remapped.
-	The remap_flags parameter accepts REMAP_FILE_* flags.  If
-	REMAP_FILE_DEDUP is set then the implementation must only remap if the
-	requested file ranges have identical contents.  If REMAP_CAN_SHORTEN is
-	set, the caller is ok with the implementation shortening the request
-	length to satisfy alignment or EOF requirements (or any other reason).
+``remap_file_range``
+	called by the ioctl(2) system call for FICLONERANGE and FICLONE
+	and FIDEDUPERANGE commands to remap file ranges.  An
+	implementation should remap len bytes at pos_in of the source
+	file into the dest file at pos_out.  Implementations must handle
+	callers passing in len == 0; this means "remap to the end of the
+	source file".  The return value should the number of bytes
+	remapped, or the usual negative error code if errors occurred
+	before any bytes were remapped.  The remap_flags parameter
+	accepts REMAP_FILE_* flags.  If REMAP_FILE_DEDUP is set then the
+	implementation must only remap if the requested file ranges have
+	identical contents.  If REMAP_CAN_SHORTEN is set, the caller is
+	ok with the implementation shortening the request length to
+	satisfy alignment or EOF requirements (or any other reason).
 
-``fadvise``: possibly called by the fadvise64() system call.
+``fadvise``
+	possibly called by the fadvise64() system call.
 
 Note that the file operations are implemented by the specific
 filesystem in which the inode resides.  When opening a device node
@@ -1041,89 +1151,104 @@ defined:
 		struct dentry *(*d_real)(struct dentry *, const struct inode *);
 	};
 
-``d_revalidate``: called when the VFS needs to revalidate a dentry.  This
-	is called whenever a name look-up finds a dentry in the
-	dcache.  Most local filesystems leave this as NULL, because all their
-	dentries in the dcache are valid.  Network filesystems are different
-	since things can change on the server without the client necessarily
-	being aware of it.
-
-	This function should return a positive value if the dentry is still
-	valid, and zero or a negative error code if it isn't.
-
-	d_revalidate may be called in rcu-walk mode (flags & LOOKUP_RCU).
-	If in rcu-walk mode, the filesystem must revalidate the dentry without
-	blocking or storing to the dentry, d_parent and d_inode should not be
-	used without care (because they can change and, in d_inode case, even
-	become NULL under us).
-
-	If a situation is encountered that rcu-walk cannot handle, return
+``d_revalidate``
+	called when the VFS needs to revalidate a dentry.  This is
+	called whenever a name look-up finds a dentry in the dcache.
+	Most local filesystems leave this as NULL, because all their
+	dentries in the dcache are valid.  Network filesystems are
+	different since things can change on the server without the
+	client necessarily being aware of it.
+
+	This function should return a positive value if the dentry is
+	still valid, and zero or a negative error code if it isn't.
+
+	d_revalidate may be called in rcu-walk mode (flags &
+	LOOKUP_RCU).  If in rcu-walk mode, the filesystem must
+	revalidate the dentry without blocking or storing to the dentry,
+	d_parent and d_inode should not be used without care (because
+	they can change and, in d_inode case, even become NULL under
+	us).
+
+	If a situation is encountered that rcu-walk cannot handle,
+	return
 	-ECHILD and it will be called again in ref-walk mode.
 
-``_weak_revalidate``: called when the VFS needs to revalidate a "jumped" dentry.
-	This is called when a path-walk ends at dentry that was not acquired by
-	doing a lookup in the parent directory.  This includes "/", "." and "..",
-	as well as procfs-style symlinks and mountpoint traversal.
+``_weak_revalidate``
+	called when the VFS needs to revalidate a "jumped" dentry.  This
+	is called when a path-walk ends at dentry that was not acquired
+	by doing a lookup in the parent directory.  This includes "/",
+	"." and "..", as well as procfs-style symlinks and mountpoint
+	traversal.
 
-	In this case, we are less concerned with whether the dentry is still
-	fully correct, but rather that the inode is still valid.  As with
-	d_revalidate, most local filesystems will set this to NULL since their
-	dcache entries are always valid.
+	In this case, we are less concerned with whether the dentry is
+	still fully correct, but rather that the inode is still valid.
+	As with d_revalidate, most local filesystems will set this to
+	NULL since their dcache entries are always valid.
 
-	This function has the same return code semantics as d_revalidate.
+	This function has the same return code semantics as
+	d_revalidate.
 
 	d_weak_revalidate is only called after leaving rcu-walk mode.
 
-``d_hash``: called when the VFS adds a dentry to the hash table.  The first
+``d_hash``
+	called when the VFS adds a dentry to the hash table.  The first
 	dentry passed to d_hash is the parent directory that the name is
 	to be hashed into.
 
 	Same locking and synchronisation rules as d_compare regarding
 	what is safe to dereference etc.
 
-``d_compare``: called to compare a dentry name with a given name.  The first
+``d_compare``
+	called to compare a dentry name with a given name.  The first
 	dentry is the parent of the dentry to be compared, the second is
-	the child dentry.  len and name string are properties of the dentry
-	to be compared.  qstr is the name to compare it with.
+	the child dentry.  len and name string are properties of the
+	dentry to be compared.  qstr is the name to compare it with.
 
 	Must be constant and idempotent, and should not take locks if
-	possible, and should not or store into the dentry.
-	Should not dereference pointers outside the dentry without
-	lots of care (eg.  d_parent, d_inode, d_name should not be used).
-
-	However, our vfsmount is pinned, and RCU held, so the dentries and
-	inodes won't disappear, neither will our sb or filesystem module.
-	->d_sb may be used.
-
-	It is a tricky calling convention because it needs to be called under
-	"rcu-walk", ie. without any locks or references on things.
-
-``d_delete``: called when the last reference to a dentry is dropped and the
-	dcache is deciding whether or not to cache it.  Return 1 to delete
-	immediately, or 0 to cache the dentry.  Default is NULL which means to
-	always cache a reachable dentry.  d_delete must be constant and
-	idempotent.
-
-``d_init``: called when a dentry is allocated
-
-``d_release``: called when a dentry is really deallocated
-
-``d_iput``: called when a dentry loses its inode (just prior to its
-	being deallocated).  The default when this is NULL is that the
-	VFS calls iput().  If you define this method, you must call
-	iput() yourself
-
-``d_dname``: called when the pathname of a dentry should be generated.
-	Useful for some pseudo filesystems (sockfs, pipefs, ...) to delay
-	pathname generation.  (Instead of doing it when dentry is created,
-	it's done only when the path is needed.).  Real filesystems probably
-	dont want to use it, because their dentries are present in global
-	dcache hash, so their hash should be an invariant.  As no lock is
-	held, d_dname() should not try to modify the dentry itself, unless
-	appropriate SMP safety is used.  CAUTION : d_path() logic is quite
-	tricky.  The correct way to return for example "Hello" is to put it
-	at the end of the buffer, and returns a pointer to the first char.
-	dynamic_dname() helper function is provided to take care of this.
+	possible, and should not or store into the dentry.  Should not
+	dereference pointers outside the dentry without lots of care
+	(eg.  d_parent, d_inode, d_name should not be used).
+
+	However, our vfsmount is pinned, and RCU held, so the dentries
+	and inodes won't disappear, neither will our sb or filesystem
+	module.  ->d_sb may be used.
+
+	It is a tricky calling convention because it needs to be called
+	under "rcu-walk", ie. without any locks or references on things.
+
+``d_delete``
+	called when the last reference to a dentry is dropped and the
+	dcache is deciding whether or not to cache it.  Return 1 to
+	delete immediately, or 0 to cache the dentry.  Default is NULL
+	which means to always cache a reachable dentry.  d_delete must
+	be constant and idempotent.
+
+``d_init``
+	called when a dentry is allocated
+
+``d_release``
+	called when a dentry is really deallocated
+
+``d_iput``
+	called when a dentry loses its inode (just prior to its being
+	deallocated).  The default when this is NULL is that the VFS
+	calls iput().  If you define this method, you must call iput()
+	yourself
+
+``d_dname``
+	called when the pathname of a dentry should be generated.
+	Useful for some pseudo filesystems (sockfs, pipefs, ...) to
+	delay pathname generation.  (Instead of doing it when dentry is
+	created, it's done only when the path is needed.).  Real
+	filesystems probably dont want to use it, because their dentries
+	are present in global dcache hash, so their hash should be an
+	invariant.  As no lock is held, d_dname() should not try to
+	modify the dentry itself, unless appropriate SMP safety is used.
+	CAUTION : d_path() logic is quite tricky.  The correct way to
+	return for example "Hello" is to put it at the end of the
+	buffer, and returns a pointer to the first char.
+	dynamic_dname() helper function is provided to take care of
+	this.
 
 	Example :
 
@@ -1135,52 +1260,57 @@ defined:
 				dentry->d_inode->i_ino);
 	}
 
-``d_automount``: called when an automount dentry is to be traversed (optional).
-	This should create a new VFS mount record and return the record to the
-	caller.  The caller is supplied with a path parameter giving the
-	automount directory to describe the automount target and the parent
-	VFS mount record to provide inheritable mount parameters.  NULL should
-	be returned if someone else managed to make the automount first.  If
-	the vfsmount creation failed, then an error code should be returned.
-	If -EISDIR is returned, then the directory will be treated as an
-	ordinary directory and returned to pathwalk to continue walking.
-
-	If a vfsmount is returned, the caller will attempt to mount it on the
-	mountpoint and will remove the vfsmount from its expiration list in
-	the case of failure.  The vfsmount should be returned with 2 refs on
-	it to prevent automatic expiration - the caller will clean up the
-	additional ref.
-
-	This function is only used if DCACHE_NEED_AUTOMOUNT is set on the
-	dentry.  This is set by __d_instantiate() if S_AUTOMOUNT is set on the
-	inode being added.
-
-``d_manage``: called to allow the filesystem to manage the transition from a
-	dentry (optional).  This allows autofs, for example, to hold up clients
-	waiting to explore behind a 'mountpoint' while letting the daemon go
-	past and construct the subtree there.  0 should be returned to let the
-	calling process continue.  -EISDIR can be returned to tell pathwalk to
-	use this directory as an ordinary directory and to ignore anything
-	mounted on it and not to check the automount flag.  Any other error
-	code will abort pathwalk completely.
+``d_automount``
+	called when an automount dentry is to be traversed (optional).
+	This should create a new VFS mount record and return the record
+	to the caller.  The caller is supplied with a path parameter
+	giving the automount directory to describe the automount target
+	and the parent VFS mount record to provide inheritable mount
+	parameters.  NULL should be returned if someone else managed to
+	make the automount first.  If the vfsmount creation failed, then
+	an error code should be returned.  If -EISDIR is returned, then
+	the directory will be treated as an ordinary directory and
+	returned to pathwalk to continue walking.
+
+	If a vfsmount is returned, the caller will attempt to mount it
+	on the mountpoint and will remove the vfsmount from its
+	expiration list in the case of failure.  The vfsmount should be
+	returned with 2 refs on it to prevent automatic expiration - the
+	caller will clean up the additional ref.
+
+	This function is only used if DCACHE_NEED_AUTOMOUNT is set on
+	the dentry.  This is set by __d_instantiate() if S_AUTOMOUNT is
+	set on the inode being added.
+
+``d_manage``
+	called to allow the filesystem to manage the transition from a
+	dentry (optional).  This allows autofs, for example, to hold up
+	clients waiting to explore behind a 'mountpoint' while letting
+	the daemon go past and construct the subtree there.  0 should be
+	returned to let the calling process continue.  -EISDIR can be
+	returned to tell pathwalk to use this directory as an ordinary
+	directory and to ignore anything mounted on it and not to check
+	the automount flag.  Any other error code will abort pathwalk
+	completely.
 
 	If the 'rcu_walk' parameter is true, then the caller is doing a
-	pathwalk in RCU-walk mode.  Sleeping is not permitted in this mode,
-	and the caller can be asked to leave it and call again by returning
-	-ECHILD.  -EISDIR may also be returned to tell pathwalk to
-	ignore d_automount or any mounts.
+	pathwalk in RCU-walk mode.  Sleeping is not permitted in this
+	mode, and the caller can be asked to leave it and call again by
+	returning -ECHILD.  -EISDIR may also be returned to tell
+	pathwalk to ignore d_automount or any mounts.
 
-	This function is only used if DCACHE_MANAGE_TRANSIT is set on the
-	dentry being transited from.
+	This function is only used if DCACHE_MANAGE_TRANSIT is set on
+	the dentry being transited from.
 
-``d_real``: overlay/union type filesystems implement this method to return one of
-	the underlying dentries hidden by the overlay.  It is used in two
-	different modes:
+``d_real``
+	overlay/union type filesystems implement this method to return
+	one of the underlying dentries hidden by the overlay.  It is
+	used in two different modes:
 
-	Called from file_dentry() it returns the real dentry matching the inode
-	argument.  The real dentry may be from a lower layer already copied up,
-	but still referenced from the file.  This mode is selected with a
-	non-NULL inode argument.
+	Called from file_dentry() it returns the real dentry matching
+	the inode argument.  The real dentry may be from a lower layer
+	already copied up, but still referenced from the file.  This
+	mode is selected with a non-NULL inode argument.
 
 	With NULL inode the topmost real underlying dentry is returned.
 
@@ -1195,40 +1325,47 @@ Directory Entry Cache API
 There are a number of functions defined which permit a filesystem to
 manipulate dentries:
 
-``dget``: open a new handle for an existing dentry (this just increments
+``dget``
+	open a new handle for an existing dentry (this just increments
 	the usage count)
 
-``dput``: close a handle for a dentry (decrements the usage count).  If
+``dput``
+	close a handle for a dentry (decrements the usage count).  If
 	the usage count drops to 0, and the dentry is still in its
 	parent's hash, the "d_delete" method is called to check whether
-	it should be cached.  If it should not be cached, or if the dentry
-	is not hashed, it is deleted.  Otherwise cached dentries are put
-	into an LRU list to be reclaimed on memory shortage.
-
-``d_drop``: this unhashes a dentry from its parents hash list.  A
-	subsequent call to dput() will deallocate the dentry if its
-	usage count drops to 0
-
-``d_delete``: delete a dentry.  If there are no other open references to
-	the dentry then the dentry is turned into a negative dentry
-	(the d_iput() method is called).  If there are other
-	references, then d_drop() is called instead
-
-``d_add``: add a dentry to its parents hash list and then calls
+	it should be cached.  If it should not be cached, or if the
+	dentry is not hashed, it is deleted.  Otherwise cached dentries
+	are put into an LRU list to be reclaimed on memory shortage.
+
+``d_drop``
+	this unhashes a dentry from its parents hash list.  A subsequent
+	call to dput() will deallocate the dentry if its usage count
+	drops to 0
+
+``d_delete``
+	delete a dentry.  If there are no other open references to the
+	dentry then the dentry is turned into a negative dentry (the
+	d_iput() method is called).  If there are other references, then
+	d_drop() is called instead
+
+``d_add``
+	add a dentry to its parents hash list and then calls
 	d_instantiate()
 
-``d_instantiate``: add a dentry to the alias hash list for the inode and
-	updates the "d_inode" member.  The "i_count" member in the
-	inode structure should be set/incremented.  If the inode
-	pointer is NULL, the dentry is called a "negative
-	dentry".  This function is commonly called when an inode is
-	created for an existing negative dentry
-
-``d_lookup``: look up a dentry given its parent and path name component
-	It looks up the child of that given name from the dcache
-	hash table.  If it is found, the reference count is incremented
-	and the dentry is returned.  The caller must use dput()
-	to free the dentry when it finishes using it.
+``d_instantiate``
+	add a dentry to the alias hash list for the inode and updates
+	the "d_inode" member.  The "i_count" member in the inode
+	structure should be set/incremented.  If the inode pointer is
+	NULL, the dentry is called a "negative dentry".  This function
+	is commonly called when an inode is created for an existing
+	negative dentry
+
+``d_lookup``
+	look up a dentry given its parent and path name component It
+	looks up the child of that given name from the dcache hash
+	table.  If it is found, the reference count is incremented and
+	the dentry is returned.  The caller must use dput() to free the
+	dentry when it finishes using it.
 
 
 Mount Options
author	Tobin C. Harding <tobin@kernel.org>	2019-06-04 10:26:56 +1000
committer	Jonathan Corbet <corbet@lwn.net>	2019-06-06 09:41:13 -0600
commit	ee5dc0491c38ae4e4e583d7532d470754bb173f6 (patch)
tree	6b0d39e34a968dcb90387bf7f13bd67e3ce560aa /Documentation/filesystems/vfs.rst
parent	af96c1e304f7051bf2ee64c9957724bdace05c58 (diff)
download	lwn-ee5dc0491c38ae4e4e583d7532d470754bb173f6.tar.gz lwn-ee5dc0491c38ae4e4e583d7532d470754bb173f6.zip