| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-09 14:43:47 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-02-09 14:43:47 -0800 |
| commit | 157d3d6efd5a58466d90be3a134f9667486fe6f9 (patch) | |
| tree | 8058c46480391d19b0e6166d4b45172c5c7de567 /Documentation/filesystems | |
| parent | 8113b3998d5c96aca885b967e6aa47e428ebc632 (diff) | |
| parent | 1bce1a664ac25d37a327c433a01bc347f0a81bd6 (diff) | |
Merge tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount updates from Christian Brauner:
- statmount: accept fd as a parameter
Extend struct mnt_id_req with a file descriptor field and a new
STATMOUNT_BY_FD flag. When set, statmount() returns mount information
for the mount the fd resides on — including detached mounts
(unmounted via umount2(MNT_DETACH)).
For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID
mask bits are cleared since neither is meaningful. The capability
check is skipped for STATMOUNT_BY_FD since holding an fd already
implies prior access to the mount and equivalent information is
available through fstatfs() and /proc/pid/mountinfo without
privilege. Includes comprehensive selftests covering both attached
and detached mount cases.
- fs: Remove internal old mount API code (1 patch)
Now that every in-tree filesystem has been converted to the new
mount API, remove all the legacy shim code in fs_context.c that
handled unconverted filesystems. This deletes ~280 lines including
legacy_init_fs_context(), the legacy_fs_context struct, and
associated wrappers. The mount(2) syscall path for userspace remains
untouched. Documentation references to the legacy callbacks are
cleaned up.
- mount: add OPEN_TREE_NAMESPACE to open_tree()
Container runtimes currently use CLONE_NEWNS to copy the caller's
entire mount namespace — only to then pivot_root() and recursively
unmount everything they just copied. With large mount tables and
thousands of parallel container launches this creates significant
contention on the namespace semaphore.
OPEN_TREE_NAMESPACE copies only the specified mount tree (like
OPEN_TREE_CLONE) but returns a mount namespace fd instead of a
detached mount fd. The new namespace contains the copied tree mounted
on top of a clone of the real rootfs.
This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a
single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER)
followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by
the new user namespace. Mount namespace file mounts are excluded from
the copy to prevent cycles. Includes ~1000 lines of selftests.
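
For comparison, the legacy sequence that the OPEN_TREE_NAMESPACE entry describes (copy the whole mount namespace, pivot into the new root, then drop the old tree) can be sketched roughly as below. This is an illustration only, not code from this merge: the helper name enter_new_root() is hypothetical, the sequence follows the pivot_root(".", ".") idiom from the pivot_root(2) manual page, and it needs CAP_SYS_ADMIN to succeed. The new flag itself is not shown because its uapi definition does not appear on this page.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: make `new_root` the root of a fresh mount namespace
 * using the pre-OPEN_TREE_NAMESPACE approach.
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
static int enter_new_root(const char *new_root)
{
	/* Copy the caller's entire mount namespace (the expensive step). */
	if (unshare(CLONE_NEWNS) != 0)
		return -1;

	/* Keep later mount changes from propagating to the parent namespace. */
	if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0)
		return -1;

	/* pivot_root() requires the new root to be a mount point. */
	if (mount(new_root, new_root, NULL, MS_BIND | MS_REC, NULL) != 0)
		return -1;

	if (chdir(new_root) != 0)
		return -1;

	/* Swap roots; the old root ends up stacked on the same path. */
	if (syscall(SYS_pivot_root, ".", ".") != 0)
		return -1;

	/* Lazily detach everything that was just copied. */
	if (umount2(".", MNT_DETACH) != 0)
		return -1;

	return chdir("/");
}
```

Every launch pays for copying and then tearing down the full mount table, which is the contention the merge message says OPEN_TREE_NAMESPACE avoids by copying only the requested tree.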
* tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/open_tree: add OPEN_TREE_NAMESPACE tests
mount: add OPEN_TREE_NAMESPACE
fs: Remove internal old mount API code
selftests: statmount: tests for STATMOUNT_BY_FD
statmount: accept fd as a parameter
statmount: permission check should return EPERM
Diffstat (limited to 'Documentation/filesystems')
| -rw-r--r-- | Documentation/filesystems/locking.rst | 8 |
| -rw-r--r-- | Documentation/filesystems/mount_api.rst | 2 |
| -rw-r--r-- | Documentation/filesystems/porting.rst | 7 |
| -rw-r--r-- | Documentation/filesystems/vfs.rst | 58 |
4 files changed, 5 insertions, 70 deletions
diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 3837891e933d..8025df6e6499 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -180,7 +180,6 @@ prototypes::
 	int (*freeze_fs) (struct super_block *);
 	int (*unfreeze_fs) (struct super_block *);
 	int (*statfs) (struct dentry *, struct kstatfs *);
-	int (*remount_fs) (struct super_block *, int *, char *);
 	void (*umount_begin) (struct super_block *);
 	int (*show_options)(struct seq_file *, struct dentry *);
 	ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
@@ -204,7 +203,6 @@ sync_fs:		read
 freeze_fs:		write
 unfreeze_fs:		write
 statfs:			maybe(read)	(see below)
-remount_fs:		write
 umount_begin:		no
 show_options:		no		(namespace_sem)
 quota_read:		no		(see below)
@@ -229,8 +227,6 @@ file_system_type
 
 prototypes::
 
-	struct dentry *(*mount) (struct file_system_type *, int,
-		       const char *, void *);
 	void (*kill_sb) (struct super_block *);
 
 locking rules:
@@ -238,13 +234,9 @@ locking rules:
 =======		=========
 ops		may block
 =======		=========
-mount		yes
 kill_sb		yes
 =======		=========
 
-->mount() returns ERR_PTR or the root dentry; its superblock should be locked
-on return.
-
 ->kill_sb() takes a write-locked superblock, does all shutdown work on it,
 unlocks and drops the reference.
 
diff --git a/Documentation/filesystems/mount_api.rst b/Documentation/filesystems/mount_api.rst
index c99ab1f7fea4..a064234fed5b 100644
--- a/Documentation/filesystems/mount_api.rst
+++ b/Documentation/filesystems/mount_api.rst
@@ -299,8 +299,6 @@ manage the filesystem context. They are as follows:
      On success it should return 0. In the case of an error, it should return
      a negative error code.
 
-     .. Note:: reconfigure is intended as a replacement for remount_fs.
-
 
 Filesystem context Security
 ===========================
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index c0f7103628ab..ed3ac56e3c76 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -448,11 +448,8 @@ a file off.
 
 **mandatory**
 
-->get_sb() is gone. Switch to use of ->mount(). Typically it's just
-a matter of switching from calling ``get_sb_``... to ``mount_``... and changing
-the function type. If you were doing it manually, just switch from setting
-->mnt_root to some pointer to returning that pointer. On errors return
-ERR_PTR(...).
+->get_sb() and ->mount() are gone. Switch to using the new mount API. See
+Documentation/filesystems/mount_api.rst for more details.
 
 ---
 
diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
index 85654eb91594..7c753148af88 100644
--- a/Documentation/filesystems/vfs.rst
+++ b/Documentation/filesystems/vfs.rst
@@ -94,11 +94,9 @@ functions:
 
 The passed struct file_system_type describes your filesystem. When a
 request is made to mount a filesystem onto a directory in your
-namespace, the VFS will call the appropriate mount() method for the
-specific filesystem. New vfsmount referring to the tree returned by
-->mount() will be attached to the mountpoint, so that when pathname
-resolution reaches the mountpoint it will jump into the root of that
-vfsmount.
+namespace, the VFS will call the appropriate get_tree() method for the
+specific filesystem. See Documentation/filesystems/mount_api.rst
+for more details.
 
 You can see all filesystems that are registered to the kernel in the
 file /proc/filesystems.
@@ -117,8 +115,6 @@ members are defined:
 	int fs_flags;
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
-	struct dentry *(*mount) (struct file_system_type *, int,
-		const char *, void *);
 	void (*kill_sb) (struct super_block *);
 	struct module *owner;
 	struct file_system_type * next;
@@ -151,10 +147,6 @@ members are defined:
 	'struct fs_parameter_spec'.
 	More info in Documentation/filesystems/mount_api.rst.
 
-``mount``
-	the method to call when a new instance of this filesystem should
-	be mounted
-
 ``kill_sb``
 	the method to call when an instance of this filesystem should be
 	shut down
@@ -173,45 +165,6 @@ members are defined:
   s_lock_key, s_umount_key, s_vfs_rename_key, s_writers_key,
   i_lock_key, i_mutex_key, invalidate_lock_key, i_mutex_dir_key: lockdep-specific
 
-The mount() method has the following arguments:
-
-``struct file_system_type *fs_type``
-	describes the filesystem, partly initialized by the specific
-	filesystem code
-
-``int flags``
-	mount flags
-
-``const char *dev_name``
-	the device name we are mounting.
-
-``void *data``
-	arbitrary mount options, usually comes as an ASCII string (see
-	"Mount Options" section)
-
-The mount() method must return the root dentry of the tree requested by
-caller. An active reference to its superblock must be grabbed and the
-superblock must be locked. On failure it should return ERR_PTR(error).
-
-The arguments match those of mount(2) and their interpretation depends
-on filesystem type. E.g. for block filesystems, dev_name is interpreted
-as block device name, that device is opened and if it contains a
-suitable filesystem image the method creates and initializes struct
-super_block accordingly, returning its root dentry to caller.
-
-->mount() may choose to return a subtree of existing filesystem - it
-doesn't have to create a new one. The main result from the caller's
-point of view is a reference to dentry at the root of (sub)tree to be
-attached; creation of new superblock is a common side effect.
-
-The most interesting member of the superblock structure that the mount()
-method fills in is the "s_op" field. This is a pointer to a "struct
-super_operations" which describes the next level of the filesystem
-implementation.
-
-For more information on mounting (and the new mount API), see
-Documentation/filesystems/mount_api.rst.
-
 The Superblock Object
 =====================
 
@@ -244,7 +197,6 @@ filesystem. The following members are defined:
 				enum freeze_wholder who);
 	int (*unfreeze_fs) (struct super_block *);
 	int (*statfs) (struct dentry *, struct kstatfs *);
-	int (*remount_fs) (struct super_block *, int *, char *);
 	void (*umount_begin) (struct super_block *);
 
 	int (*show_options)(struct seq_file *, struct dentry *);
@@ -351,10 +303,6 @@ or bottom half).
 ``statfs``
 	called when the VFS needs to get filesystem statistics.
 
-``remount_fs``
-	called when the filesystem is remounted. This is called with
-	the kernel lock held
-
 ``umount_begin``
 	called when the VFS is unmounting a filesystem.
 
