author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-05-23 17:42:39 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-05-23 17:42:39 -0700 |
commit | 644473e9c60c1ff4f6351fed637a6e5551e3dce7 (patch) | |
tree | 10316518bedc735a2c6552886658d69dfd9f1eb0 /init | |
parent | fb827ec68446c83e9e8754fa9b55aed27ecc4661 (diff) | |
parent | 4b06a81f1daee668fbd6de85557bfb36dd36078f (diff) | |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace enhancements from Eric Biederman:
"This is a course correction for the user namespace, so that we can
reach an inexpensive, maintainable, and reasonably complete
implementation.
Highlights:
- Config guards make it impossible to enable the user namespace and
code that has not been converted to be user namespace safe.
- Use of the new kuid_t type ensures that if you somehow get past the
config guards, the kernel will encounter type errors if you enable
user namespaces and attempt to compile in code whose permission
checks have not been updated to be user namespace safe.
- All uids from child user namespaces are mapped into the initial
user namespace before they are processed, removing the need to add
an additional check to see if the user namespace of the compared
uids remains the same.
- With the user namespaces compiled out, the performance is as good as
or better than it is today.
- For most operations, absolutely nothing changes in performance or
operation with the user namespace enabled.
- The worst case performance I could come up with was timing 1
billion cache cold stat operations with the user namespace code
enabled. This went from 156s to 164s on my laptop (or 156ns to
164ns per stat operation).
- (uid_t)-1 and (gid_t)-1 are reserved as internal error values.
Most uid/gid setting system calls treat these values specially
anyway, so attempting to use -1 as a uid would likely cause
entertaining failures in userspace.
- If setuid is called with a uid that cannot be mapped, setuid fails.
I have looked at sendmail, login, ssh and every other program I
could think of that would call setuid, and they all check for and
handle the case where setuid fails.
- If stat or a similar system call is called from a context in which
we cannot map a uid, we lie and return overflowuid. The LFS
experience suggests that not lying and returning an error code might
be better, but the historical precedent with uids is different and I
cannot think of anything that would break by lying about a uid we
can't map.
- Capabilities are localized to the current user namespace, making it
safe to give the initial user in a user namespace all capabilities.
My git tree covers all of the modifications needed to convert the core
kernel and enough changes to make a system bootable to runlevel 1."
Fix up trivial conflicts due to nearby independent changes in fs/stat.c
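
For the worst-case stat timing quoted in the highlights, the per-operation figures follow directly from the totals. The helpers below are a small arithmetic sketch; the function names are hypothetical and the only inputs are the numbers given in the message.

```c
#include <assert.h>

/* Average cost of one operation in nanoseconds, given total wall time
 * in seconds and the number of operations performed. */
static double ns_per_op(double total_seconds, double operations)
{
    assert(operations > 0.0);
    return total_seconds * 1e9 / operations;
}

/* Relative slowdown of `after` compared with `before`, in percent. */
static double overhead_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

For 1 billion operations, `ns_per_op(156.0, 1e9)` and `ns_per_op(164.0, 1e9)` give 156 ns and 164 ns, and `overhead_percent(156.0, 164.0)` comes to a little over 5%.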
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
userns: Silence silly gcc warning.
cred: use correct cred accessor with regards to rcu read lock
userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
userns: Convert cgroup permission checks to use uid_eq
userns: Convert tmpfs to use kuid and kgid where appropriate
userns: Convert sysfs to use kgid/kuid where appropriate
userns: Convert sysctl permission checks to use kuid and kgids.
userns: Convert proc to use kuid/kgid where appropriate
userns: Convert ext4 to user kuid/kgid where appropriate
userns: Convert ext3 to use kuid/kgid where appropriate
userns: Convert ext2 to use kuid/kgid where appropriate.
userns: Convert devpts to use kuid/kgid where appropriate
userns: Convert binary formats to use kuid/kgid where appropriate
userns: Add negative depends on entries to avoid building code that is userns unsafe
userns: signal remove unnecessary map_cred_ns
userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
userns: Convert stat to return values mapped from kuids and kgids
userns: Convert user specfied uids and gids in chown into kuids and kgid
userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
...
Diffstat (limited to 'init')
-rw-r--r-- | init/Kconfig | 130 |
1 files changed, 129 insertions, 1 deletions
diff --git a/init/Kconfig b/init/Kconfig
index a30fe085940e..ccb5248474c2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -873,7 +873,10 @@ config IPC_NS
 config USER_NS
 	bool "User namespace (EXPERIMENTAL)"
 	depends on EXPERIMENTAL
-	default y
+	depends on UIDGID_CONVERTED
+	select UIDGID_STRICT_TYPE_CHECKS
+
+	default n
 	help
 	  This allows containers, i.e. vservers, to use user namespaces
 	  to provide different user info for different servers.
@@ -897,6 +900,131 @@ config NET_NS
 
 endif # NAMESPACES
 
+config UIDGID_CONVERTED
+	# True if all of the selected software conmponents are known
+	# to have uid_t and gid_t converted to kuid_t and kgid_t
+	# where appropriate and are otherwise safe to use with
+	# the user namespace.
+	bool
+	default y
+
+	# List of kernel pieces that need user namespace work
+	# Features
+	depends on SYSVIPC = n
+	depends on IMA = n
+	depends on EVM = n
+	depends on KEYS = n
+	depends on AUDIT = n
+	depends on AUDITSYSCALL = n
+	depends on TASKSTATS = n
+	depends on TRACING = n
+	depends on FS_POSIX_ACL = n
+	depends on QUOTA = n
+	depends on QUOTACTL = n
+	depends on DEBUG_CREDENTIALS = n
+	depends on BSD_PROCESS_ACCT = n
+	depends on DRM = n
+	depends on PROC_EVENTS = n
+
+	# Networking
+	depends on NET = n
+	depends on NET_9P = n
+	depends on IPX = n
+	depends on PHONET = n
+	depends on NET_CLS_FLOW = n
+	depends on NETFILTER_XT_MATCH_OWNER = n
+	depends on NETFILTER_XT_MATCH_RECENT = n
+	depends on NETFILTER_XT_TARGET_LOG = n
+	depends on NETFILTER_NETLINK_LOG = n
+	depends on INET = n
+	depends on IPV6 = n
+	depends on IP_SCTP = n
+	depends on AF_RXRPC = n
+	depends on LLC2 = n
+	depends on NET_KEY = n
+	depends on INET_DIAG = n
+	depends on DNS_RESOLVER = n
+	depends on AX25 = n
+	depends on ATALK = n
+
+	# Filesystems
+	depends on USB_DEVICEFS = n
+	depends on USB_GADGETFS = n
+	depends on USB_FUNCTIONFS = n
+	depends on DEVTMPFS = n
+	depends on XENFS = n
+
+	depends on 9P_FS = n
+	depends on ADFS_FS = n
+	depends on AFFS_FS = n
+	depends on AFS_FS = n
+	depends on AUTOFS4_FS = n
+	depends on BEFS_FS = n
+	depends on BFS_FS = n
+	depends on BTRFS_FS = n
+	depends on CEPH_FS = n
+	depends on CIFS = n
+	depends on CODA_FS = n
+	depends on CONFIGFS_FS = n
+	depends on CRAMFS = n
+	depends on DEBUG_FS = n
+	depends on ECRYPT_FS = n
+	depends on EFS_FS = n
+	depends on EXOFS_FS = n
+	depends on FAT_FS = n
+	depends on FUSE_FS = n
+	depends on GFS2_FS = n
+	depends on HFS_FS = n
+	depends on HFSPLUS_FS = n
+	depends on HPFS_FS = n
+	depends on HUGETLBFS = n
+	depends on ISO9660_FS = n
+	depends on JFFS2_FS = n
+	depends on JFS_FS = n
+	depends on LOGFS = n
+	depends on MINIX_FS = n
+	depends on NCP_FS = n
+	depends on NFSD = n
+	depends on NFS_FS = n
+	depends on NILFS2_FS = n
+	depends on NTFS_FS = n
+	depends on OCFS2_FS = n
+	depends on OMFS_FS = n
+	depends on QNX4FS_FS = n
+	depends on QNX6FS_FS = n
+	depends on REISERFS_FS = n
+	depends on SQUASHFS = n
+	depends on SYSV_FS = n
+	depends on UBIFS_FS = n
+	depends on UDF_FS = n
+	depends on UFS_FS = n
+	depends on VXFS_FS = n
+	depends on XFS_FS = n
+
+	depends on !UML || HOSTFS = n
+
+	# The rare drivers that won't build
+	depends on AIRO = n
+	depends on AIRO_CS = n
+	depends on TUN = n
+	depends on INFINIBAND_QIB = n
+	depends on BLK_DEV_LOOP = n
+	depends on ANDROID_BINDER_IPC = n
+
+	# Security modules
+	depends on SECURITY_TOMOYO = n
+	depends on SECURITY_APPARMOR = n
+
+config UIDGID_STRICT_TYPE_CHECKS
+	bool "Require conversions between uid/gids and their internal representation"
+	depends on UIDGID_CONVERTED
+	default n
+	help
+	 While the nececessary conversions are being added to all subsystems this option allows
+	 the code to continue to build for unconverted subsystems.
+
+	 Say Y here if you want the strict type checking enabled
+
 config SCHED_AUTOGROUP
 	bool "Automatic process group scheduling"
 	select EVENTFD
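
The help text for UIDGID_STRICT_TYPE_CHECKS describes a build-time switch between a checked and an unchecked id representation. A rough userspace sketch of how such a switch can work in C follows. It is not taken from the kernel headers: `DEMO_STRICT_TYPE_CHECKS` and all `demo_*` names are hypothetical stand-ins. With the switch on, the id is wrapped in a struct, so an unconverted comparison such as `a == b` no longer compiles; with it off, the type collapses to a plain integer and old code keeps building.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical switch standing in for CONFIG_UIDGID_STRICT_TYPE_CHECKS. */
#ifndef DEMO_STRICT_TYPE_CHECKS
#define DEMO_STRICT_TYPE_CHECKS 1
#endif

#if DEMO_STRICT_TYPE_CHECKS
/* Strict mode: a distinct struct type. `==` is not defined for structs,
 * so code that still compares ids directly fails to compile. */
typedef struct { uint32_t val; } demo_kuid_t;
#define DEMO_KUID_INIT(v) ((demo_kuid_t){ (v) })
static inline uint32_t demo_kuid_val(demo_kuid_t k) { return k.val; }
#else
/* Relaxed mode: a plain integer, so unconverted code continues to build. */
typedef uint32_t demo_kuid_t;
#define DEMO_KUID_INIT(v) ((demo_kuid_t)(v))
static inline uint32_t demo_kuid_val(demo_kuid_t k) { return k; }
#endif

/* The wrapper adds no storage cost in either mode. */
static_assert(sizeof(demo_kuid_t) == sizeof(uint32_t),
              "wrapped id must be the same size as the raw id");

/* Comparison helper that works in both modes. */
static inline bool demo_uid_eq(demo_kuid_t a, demo_kuid_t b)
{
    return demo_kuid_val(a) == demo_kuid_val(b);
}

/* A converted permission check: it goes through the helper rather than
 * comparing the two ids with `==`. */
static inline bool demo_is_owner(demo_kuid_t fsuid, demo_kuid_t inode_uid)
{
    return demo_uid_eq(fsuid, inode_uid);
}
```

Building the same file with `-DDEMO_STRICT_TYPE_CHECKS=0` gives the relaxed variant. The helper-based check compiles in both modes, which is what lets subsystems be converted one at a time.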