lwn.git/fs/overlayfs/dir.c, branch docs-6.9

rename(): avoid a deadlock in the case of parents having no common ancestor

2023-11-25T07:54:14+00:00

... and fix the directory locking documentation and proof of correctness. Holding ->s_vfs_rename_mutex *almost* prevents ->d_parent changes; the case where we really don't want it is splicing the root of disconnected tree to somewhere. In other words, ->s_vfs_rename_mutex is sufficient to stabilize "X is an ancestor of Y" only if X and Y are already in the same tree. Otherwise it can go from false to true, and one can construct a deadlock on that. Make lock_two_directories() report an error in such case and update the callers of lock_rename()/lock_rename_child() to handle such errors. And yes, such conditions are not impossible to create ;-/ Reviewed-by: Jan Kara Signed-off-by: Al Viro

ovl: Add an alternative type of whiteout

2023-10-30T22:12:59+00:00

An xattr whiteout (called "xwhiteout" in the code) is a reguar file of zero size with the "overlay.whiteout" xattr set. A file like this in a directory with the "overlay.whiteouts" xattrs set will be treated the same way as a regular whiteout. The "overlay.whiteouts" directory xattr is used in order to efficiently handle overlay checks in readdir(), as we only need to checks xattrs in affected directories. The advantage of this kind of whiteout is that they can be escaped using the standard overlay xattr escaping mechanism. So, a file with a "overlay.overlay.whiteout" xattr would be unescaped to "overlay.whiteout", which could then be consumed by another overlayfs as a whiteout. Overlayfs itself doesn't create whiteouts like this, but a userspace mechanism could use this alternative mechanism to convert images that may contain whiteouts to be used with overlayfs. To work as a whiteout for both regular overlayfs mounts as well as userxattr mounts both the "user.overlay.whiteout*" and the "trusted.overlay.whiteout*" xattrs will need to be created. Signed-off-by: Alexander Larsson Reviewed-by: Amir Goldstein Signed-off-by: Amir Goldstein

ovl: reorder ovl_want_write() after ovl_inode_lock()

2023-10-30T22:12:57+00:00

Make the locking order of ovl_inode_lock() strictly between the two vfs stacked layers, i.e.: - ovl vfs locks: sb_writers, inode_lock, ... - ovl_inode_lock - upper vfs locks: sb_writers, inode_lock, ... To that effect, move ovl_want_write() into the helpers ovl_nlink_start() and ovl_copy_up_start which currently take the ovl_inode_lock() after ovl_want_write(). Signed-off-by: Amir Goldstein

ovl: store enum redirect_mode in config instead of a string

2023-06-19T11:02:01+00:00

Do all the logic to set the mode during mount options parsing and do not keep the option string around. Use a constant_table to translate from enum redirect mode to string in preperation for new mount api option parsing. The mount option "off" is translated to either "follow" or "nofollow", depending on the "redirect_always_follow" build/module config, so in effect, there are only three possible redirect modes. This results in a minor change to the string that is displayed in show_options() - when redirect_dir is enabled by default and the user mounts with the option "redirect_dir=off", instead of displaying the mode "redirect_dir=off" in show_options(), the displayed mode will be either "redirect_dir=follow" or "redirect_dir=nofollow", depending on the value of "redirect_always_follow" build/module config. The displayed mode reflects the effective mode, so mounting overlayfs again with the dispalyed redirect_dir option will result with the same effective and displayed mode. Signed-off-by: Amir Goldstein

ovl: negate the ofs->share_whiteout boolean

2023-06-19T11:02:00+00:00

The default common case is that whiteout sharing is enabled. Change to storing the negated no_shared_whiteout state, so we will not need to initialize it. This is the first step towards removing all config and feature initializations out of ovl_fill_super(). Signed-off-by: Amir Goldstein

ovl: move ovl_entry into ovl_inode

2023-06-19T11:01:13+00:00

The lower stacks of all the ovl inode aliases should be identical and there is redundant information in ovl_entry and ovl_inode. Move lowerstack into ovl_inode and keep only the OVL_E_FLAGS per overlay dentry. Following patches will deduplicate redundant ovl_inode fields. Note that for pure upper and negative dentries, OVL_E(dentry) may be NULL now, so it is imporatnt to use the ovl_numlower() accessor. Reviewed-by: Alexander Larsson Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi

ovl: update of dentry revalidate flags after copy up

2023-06-19T11:01:12+00:00

After copy up, we may need to update d_flags if upper dentry is on a remote fs and lower dentries are not. Add helpers to allow incremental update of the revalidate flags. Fixes: bccece1ead36 ("ovl: allow remote upper") Reviewed-by: Gao Xiang Signed-off-by: Amir Goldstein Signed-off-by: Miklos Szeredi

fs: port inode_init_owner() to mnt_idmap

2023-01-19T08:24:28+00:00

Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner Reviewed-by: Christoph Hellwig Signed-off-by: Christian Brauner (Microsoft)

fs: port ->rename() to pass mnt_idmap

2023-01-19T08:24:26+00:00

fs: port ->mknod() to pass mnt_idmap

2023-01-19T08:24:26+00:00