Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security subsystem updates from James Morris: "New notable features: - The seccomp work from Will Drewry - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski - Longer security labels for Smack from Casey Schaufler - Additional ptrace restriction modes for Yama by Kees Cook" Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits) apparmor: fix long path failure due to disconnected path apparmor: fix profile lookup for unconfined ima: fix filename hint to reflect script interpreter name KEYS: Don't check for NULL key pointer in key_validate() Smack: allow for significantly longer Smack labels v4 gfp flags for security_inode_alloc()? Smack: recursive tramsmute Yama: replace capable() with ns_capable() TOMOYO: Accept manager programs which do not start with / . KEYS: Add invalidation support KEYS: Do LRU discard in full keyrings KEYS: Permit in-place link replacement in keyring list KEYS: Perform RCU synchronisation on keys prior to key destruction KEYS: Announce key type (un)registration KEYS: Reorganise keys Makefile KEYS: Move the key config into security/keys/Kconfig KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat Yama: remove an unused variable samples/seccomp: fix dependencies on arch macros Yama: add additional ptrace scopes ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2012-05-21 20:27:36 -0700
committer: Linus Torvalds <torvalds@linux-foundation.org> 2012-05-21 20:27:36 -0700
commit: cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b (patch)
tree: 4322be35db678f6299348a76ad60a2023954af7d
parent: 99262a3dafa3290866512ddfb32609198f8973e9 (diff)
parent: ff2bb047c4bce9742e94911eeb44b4d6ff4734ab (diff)
download: lwn-cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b.tar.gz
lwn-cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b.zip
102 files changed, 3678 insertions, 1230 deletions
diff --git a/Documentation/prctl/seccomp_filter.txt b/Documentation/prctl/seccomp_filter.txt
new file mode 100644
index 000000000000..597c3c581375
--- /dev/null
+++ b/Documentation/prctl/seccomp_filter.txt
@@ -0,0 +1,163 @@
+		SECure COMPuting with filters
+		=============================
+
+Introduction
+------------
+
+A large number of system calls are exposed to every userland process
+with many of them going unused for the entire lifetime of the process.
+As system calls change and mature, bugs are found and eradicated.  A
+certain subset of userland applications benefit by having a reduced set
+of available system calls.  The resulting set reduces the total kernel
+surface exposed to the application.  System call filtering is meant for
+use with those applications.
+
+Seccomp filtering provides a means for a process to specify a filter for
+incoming system calls.  The filter is expressed as a Berkeley Packet
+Filter (BPF) program, as with socket filters, except that the data
+operated on is related to the system call being made: system call
+number and the system call arguments.  This allows for expressive
+filtering of system calls using a filter program language with a long
+history of being exposed to userland and a straightforward data set.
+
+Additionally, BPF makes it impossible for users of seccomp to fall prey
+to time-of-check-time-of-use (TOCTOU) attacks that are common in system
+call interposition frameworks.  BPF programs may not dereference
+pointers which constrains all filters to solely evaluating the system
+call arguments directly.
+
+What it isn't
+-------------
+
+System call filtering isn't a sandbox.  It provides a clearly defined
+mechanism for minimizing the exposed kernel surface.  It is meant to be
+a tool for sandbox developers to use.  Beyond that, policy for logical
+behavior and information flow should be managed with a combination of
+other system hardening techniques and, potentially, an LSM of your
+choosing.  Expressive, dynamic filters provide further options down this
+path (avoiding pathological sizes or selecting which of the multiplexed
+system calls in socketcall() is allowed, for instance) which could be
+construed, incorrectly, as a more complete sandboxing solution.
+
+Usage
+-----
+
+An additional seccomp mode is added and is enabled using the same
+prctl(2) call as the strict seccomp.  If the architecture has
+CONFIG_HAVE_ARCH_SECCOMP_FILTER, then filters may be added as below:
+
+PR_SET_SECCOMP:
+	Now takes an additional argument which specifies a new filter
+	using a BPF program.
+	The BPF program will be executed over struct seccomp_data
+	reflecting the system call number, arguments, and other
+	metadata.  The BPF program must then return one of the
+	acceptable values to inform the kernel which action should be
+	taken.
+
+	Usage:
+		prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, prog);
+
+	The 'prog' argument is a pointer to a struct sock_fprog which
+	will contain the filter program.  If the program is invalid, the
+	call will return -1 and set errno to EINVAL.
+
+	If fork/clone and execve are allowed by @prog, any child
+	processes will be constrained to the same filters and system
+	call ABI as the parent.
+
+	Prior to use, the task must call prctl(PR_SET_NO_NEW_PRIVS, 1) or
+	run with CAP_SYS_ADMIN privileges in its namespace.  If these are not
+	true, -EACCES will be returned.  This requirement ensures that filter
+	programs cannot be applied to child processes with greater privileges
+	than the task that installed them.
+
+	Additionally, if prctl(2) is allowed by the attached filter,
+	additional filters may be layered on which will increase evaluation
+	time, but allow for further decreasing the attack surface during
+	execution of a process.
+
+The above call returns 0 on success and non-zero on error.
+
+Return values
+-------------
+A seccomp filter may return any of the following values. If multiple
+filters exist, the return value for the evaluation of a given system
+call will always use the highest precedent value. (For example,
+SECCOMP_RET_KILL will always take precedence.)
+
+In precedence order, they are:
+
+SECCOMP_RET_KILL:
+	Results in the task exiting immediately without executing the
+	system call.  The exit status of the task (status & 0x7f) will
+	be SIGSYS, not SIGKILL.
+
+SECCOMP_RET_TRAP:
+	Results in the kernel sending a SIGSYS signal to the triggering
+	task without executing the system call.  The kernel will
+	rollback the register state to just before the system call
+	entry such that a signal handler in the task will be able to
+	inspect the ucontext_t->uc_mcontext registers and emulate
+	system call success or failure upon return from the signal
+	handler.
+
+	The SECCOMP_RET_DATA portion of the return value will be passed
+	as si_errno.
+
+	SIGSYS triggered by seccomp will have a si_code of SYS_SECCOMP.
+
+SECCOMP_RET_ERRNO:
+	Results in the lower 16-bits of the return value being passed
+	to userland as the errno without executing the system call.
+
+SECCOMP_RET_TRACE:
+	When returned, this value will cause the kernel to attempt to
+	notify a ptrace()-based tracer prior to executing the system
+	call.  If there is no tracer present, -ENOSYS is returned to
+	userland and the system call is not executed.
+
+	A tracer will be notified if it requests PTRACE_O_TRACESECCOMP
+	using ptrace(PTRACE_SETOPTIONS).  The tracer will be notified
+	of a PTRACE_EVENT_SECCOMP and the SECCOMP_RET_DATA portion of
+	the BPF program return value will be available to the tracer
+	via PTRACE_GETEVENTMSG.
+
+SECCOMP_RET_ALLOW:
+	Results in the system call being executed.
+
+If multiple filters exist, the return value for the evaluation of a
+given system call will always use the highest precedent value.
+
+Precedence is only determined using the SECCOMP_RET_ACTION mask.  When
+multiple filters return values of the same precedence, only the
+SECCOMP_RET_DATA from the most recently installed filter will be
+returned.
+
+Pitfalls
+--------
+
+The biggest pitfall to avoid during use is filtering on system call
+number without checking the architecture value.  Why?  On any
+architecture that supports multiple system call invocation conventions,
+the system call numbers may vary based on the specific invocation.  If
+the numbers in the different calling conventions overlap, then checks in
+the filters may be abused.  Always check the arch value!
+
+Example
+-------
+
+The samples/seccomp/ directory contains both an x86-specific example
+and a more generic example of a higher level macro interface for BPF
+program generation.
+
+
+
+Adding architecture support
+-----------------------
+
+See arch/Kconfig for the authoritative requirements.  In general, if an
+architecture supports both ptrace_event and seccomp, it will be able to
+support seccomp filter with minor fixup: SIGSYS support and seccomp return
+value checking.  Then it must just add CONFIG_HAVE_ARCH_SECCOMP_FILTER
+to its arch-specific Kconfig.
diff --git a/Documentation/security/Smack.txt b/Documentation/security/Smack.txt
index d2f72ae66432..a416479b8a1c 100644
--- a/Documentation/security/Smack.txt
+++ b/Documentation/security/Smack.txt
@@ -15,7 +15,7 @@ at hand.
 
 Smack consists of three major components:
     - The kernel
-    - A start-up script and a few modified applications
+    - Basic utilities, which are helpful but not required
     - Configuration data
 
 The kernel component of Smack is implemented as a Linux
@@ -23,37 +23,28 @@ Security Modules (LSM) module. It requires netlabel and
 works best with file systems that support extended attributes,
 although xattr support is not strictly required.
 It is safe to run a Smack kernel under a "vanilla" distribution.
+
 Smack kernels use the CIPSO IP option. Some network
 configurations are intolerant of IP options and can impede
 access to systems that use them as Smack does.
 
-The startup script etc-init.d-smack should be installed
-in /etc/init.d/smack and should be invoked early in the
-start-up process. On Fedora rc5.d/S02smack is recommended.
-This script ensures that certain devices have the correct
-Smack attributes and loads the Smack configuration if
-any is defined. This script invokes two programs that
-ensure configuration data is properly formatted. These
-programs are /usr/sbin/smackload and /usr/sin/smackcipso.
-The system will run just fine without these programs,
-but it will be difficult to set access rules properly.
-
-A version of "ls" that provides a "-M" option to display
-Smack labels on long listing is available.
+The current git repositories for Smack user space are:
 
-A hacked version of sshd that allows network logins by users
-with specific Smack labels is available. This version does
-not work for scp. You must set the /etc/ssh/sshd_config
-line:
-   UsePrivilegeSeparation no
+	git@gitorious.org:meego-platform-security/smackutil.git
+	git@gitorious.org:meego-platform-security/libsmack.git
 
-The format of /etc/smack/usr is:
+These should make and install on most modern distributions.
+There are three commands included in smackutil:
 
-   username smack
+smackload  - properly formats data for writing to /smack/load
+smackcipso - properly formats data for writing to /smack/cipso
+chsmack    - display or set Smack extended attribute values
 
 In keeping with the intent of Smack, configuration data is
 minimal and not strictly required. The most important
 configuration step is mounting the smackfs pseudo filesystem.
+If smackutil is installed the startup script will take care
+of this, but it can be manually as well.
 
 Add this line to /etc/fstab:
 
@@ -61,19 +52,148 @@ Add this line to /etc/fstab:
 
 and create the /smack directory for mounting.
 
-Smack uses extended attributes (xattrs) to store file labels.
-The command to set a Smack label on a file is:
+Smack uses extended attributes (xattrs) to store labels on filesystem
+objects. The attributes are stored in the extended attribute security
+name space. A process must have CAP_MAC_ADMIN to change any of these
+attributes.
+
+The extended attributes that Smack uses are:
+
+SMACK64
+	Used to make access control decisions. In almost all cases
+	the label given to a new filesystem object will be the label
+	of the process that created it.
+SMACK64EXEC
+	The Smack label of a process that execs a program file with
+	this attribute set will run with this attribute's value.
+SMACK64MMAP
+	Don't allow the file to be mmapped by a process whose Smack
+	label does not allow all of the access permitted to a process
+	with the label contained in this attribute. This is a very
+	specific use case for shared libraries.
+SMACK64TRANSMUTE
+	Can only have the value "TRUE". If this attribute is present
+	on a directory when an object is created in the directory and
+	the Smack rule (more below) that permitted the write access
+	to the directory includes the transmute ("t") mode the object
+	gets the label of the directory instead of the label of the
+	creating process. If the object being created is a directory
+	the SMACK64TRANSMUTE attribute is set as well.
+SMACK64IPIN
+	This attribute is only available on file descriptors for sockets.
+	Use the Smack label in this attribute for access control
+	decisions on packets being delivered to this socket.
+SMACK64IPOUT
+	This attribute is only available on file descriptors for sockets.
+	Use the Smack label in this attribute for access control
+	decisions on packets coming from this socket.
+
+There are multiple ways to set a Smack label on a file:
 
     # attr -S -s SMACK64 -V "value" path
+    # chsmack -a value path
 
-NOTE: Smack labels are limited to 23 characters. The attr command
-      does not enforce this restriction and can be used to set
-      invalid Smack labels on files.
-
-If you don't do anything special all users will get the floor ("_")
-label when they log in. If you do want to log in via the hacked ssh
-at other labels use the attr command to set the smack value on the
-home directory and its contents.
+A process can see the smack label it is running with by
+reading /proc/self/attr/current. A process with CAP_MAC_ADMIN
+can set the process smack by writing there.
+
+Most Smack configuration is accomplished by writing to files
+in the smackfs filesystem. This pseudo-filesystem is usually
+mounted on /smack.
+
+access
+	This interface reports whether a subject with the specified
+	Smack label has a particular access to an object with a
+	specified Smack label. Write a fixed format access rule to
+	this file. The next read will indicate whether the access
+	would be permitted. The text will be either "1" indicating
+	access, or "0" indicating denial.
+access2
+	This interface reports whether a subject with the specified
+	Smack label has a particular access to an object with a
+	specified Smack label. Write a long format access rule to
+	this file. The next read will indicate whether the access
+	would be permitted. The text will be either "1" indicating
+	access, or "0" indicating denial.
+ambient
+	This contains the Smack label applied to unlabeled network
+	packets.
+cipso
+	This interface allows a specific CIPSO header to be assigned
+	to a Smack label. The format accepted on write is:
+		"%24s%4d%4d"["%4d"]...
+	The first string is a fixed Smack label. The first number is
+	the level to use. The second number is the number of categories.
+	The following numbers are the categories.
+	"level-3-cats-5-19          3   2   5  19"
+cipso2
+	This interface allows a specific CIPSO header to be assigned
+	to a Smack label. The format accepted on write is:
+	"%s%4d%4d"["%4d"]...
+	The first string is a long Smack label. The first number is
+	the level to use. The second number is the number of categories.
+	The following numbers are the categories.
+	"level-3-cats-5-19   3   2   5  19"
+direct
+	This contains the CIPSO level used for Smack direct label
+	representation in network packets.
+doi
+	This contains the CIPSO domain of interpretation used in
+	network packets.
+load
+	This interface allows access control rules in addition to
+	the system defined rules to be specified. The format accepted
+	on write is:
+		"%24s%24s%5s"
+	where the first string is the subject label, the second the
+	object label, and the third the requested access. The access
+	string may contain only the characters "rwxat-", and specifies
+	which sort of access is allowed. The "-" is a placeholder for
+	permissions that are not allowed. The string "r-x--" would
+	specify read and execute access. Labels are limited to 23
+	characters in length.
+load2
+	This interface allows access control rules in addition to
+	the system defined rules to be specified. The format accepted
+	on write is:
+		"%s %s %s"
+	where the first string is the subject label, the second the
+	object label, and the third the requested access. The access
+	string may contain only the characters "rwxat-", and specifies
+	which sort of access is allowed. The "-" is a placeholder for
+	permissions that are not allowed. The string "r-x--" would
+	specify read and execute access.
+load-self
+	This interface allows process specific access rules to be
+	defined. These rules are only consulted if access would
+	otherwise be permitted, and are intended to provide additional
+	restrictions on the process. The format is the same as for
+	the load interface.
+load-self2
+	This interface allows process specific access rules to be
+	defined. These rules are only consulted if access would
+	otherwise be permitted, and are intended to provide additional
+	restrictions on the process. The format is the same as for
+	the load2 interface.
+logging
+	This contains the Smack logging state.
+mapped
+	This contains the CIPSO level used for Smack mapped label
+	representation in network packets.
+netlabel
+	This interface allows specific internet addresses to be
+	treated as single label hosts. Packets are sent to single
+	label hosts without CIPSO headers, but only from processes
+	that have Smack write access to the host label. All packets
+	received from single label hosts are given the specified
+	label. The format accepted on write is:
+		"%d.%d.%d.%d label" or "%d.%d.%d.%d/%d label".
+onlycap
+	This contains the label processes must have for CAP_MAC_ADMIN
+	and CAP_MAC_OVERRIDE to be effective. If this file is empty
+	these capabilities are effective at for processes with any
+	label. The value is set by writing the desired label to the
+	file or cleared by writing "-" to the file.
 
 You can add access rules in /etc/smack/accesses. They take the form:
 
@@ -83,10 +203,6 @@ access is a combination of the letters rwxa which specify the
 kind of access permitted a subject with subjectlabel on an
 object with objectlabel. If there is no rule no access is allowed.
 
-A process can see the smack label it is running with by
-reading /proc/self/attr/current. A privileged process can
-set the process smack by writing there.
-
 Look for additional programs on http://schaufler-ca.com
 
 From the Smack Whitepaper:
@@ -186,7 +302,7 @@ team. Smack labels are unstructured, case sensitive, and the only operation
 ever performed on them is comparison for equality. Smack labels cannot
 contain unprintable characters, the "/" (slash), the "\" (backslash), the "'"
 (quote) and '"' (double-quote) characters.
-Smack labels cannot begin with a '-', which is reserved for special options.
+Smack labels cannot begin with a '-'. This is reserved for special options.
 
 There are some predefined labels:
 
@@ -194,7 +310,7 @@ There are some predefined labels:
 	^ 	Pronounced "hat", a single circumflex character.
 	* 	Pronounced "star", a single asterisk character.
 	? 	Pronounced "huh", a single question mark character.
-	@ 	Pronounced "Internet", a single at sign character.
+	@ 	Pronounced "web", a single at sign character.
 
 Every task on a Smack system is assigned a label. System tasks, such as
 init(8) and systems daemons, are run with the floor ("_") label. User tasks
@@ -246,13 +362,14 @@ The format of an access rule is:
 
 Where subject-label is the Smack label of the task, object-label is the Smack
 label of the thing being accessed, and access is a string specifying the sort
-of access allowed. The Smack labels are limited to 23 characters. The access
-specification is searched for letters that describe access modes:
+of access allowed. The access specification is searched for letters that
+describe access modes:
 
 	a: indicates that append access should be granted.
 	r: indicates that read access should be granted.
 	w: indicates that write access should be granted.
 	x: indicates that execute access should be granted.
+	t: indicates that the rule requests transmutation.
 
 Uppercase values for the specification letters are allowed as well.
 Access mode specifications can be in any order. Examples of acceptable rules
@@ -273,7 +390,7 @@ Examples of unacceptable rules are:
 
 Spaces are not allowed in labels. Since a subject always has access to files
 with the same label specifying a rule for that case is pointless. Only
-valid letters (rwxaRWXA) and the dash ('-') character are allowed in
+valid letters (rwxatRWXAT) and the dash ('-') character are allowed in
 access specifications. The dash is a placeholder, so "a-r" is the same
 as "ar". A lone dash is used to specify that no access should be allowed.
 
@@ -297,6 +414,13 @@ but not any of its attributes by the circumstance of having read access to the
 containing directory but not to the differently labeled file. This is an
 artifact of the file name being data in the directory, not a part of the file.
 
+If a directory is marked as transmuting (SMACK64TRANSMUTE=TRUE) and the
+access rule that allows a process to create an object in that directory
+includes 't' access the label assigned to the new object will be that
+of the directory, not the creating process. This makes it much easier
+for two processes with different labels to share data without granting
+access to all of their files.
+
 IPC objects, message queues, semaphore sets, and memory segments exist in flat
 namespaces and access requests are only required to match the object in
 question.
diff --git a/Documentation/security/Yama.txt b/Documentation/security/Yama.txt
index a9511f179069..e369de2d48cd 100644
--- a/Documentation/security/Yama.txt
+++ b/Documentation/security/Yama.txt
@@ -34,7 +34,7 @@ parent to a child process (i.e. direct "gdb EXE" and "strace EXE" still
 work), or with CAP_SYS_PTRACE (i.e. "gdb --pid=PID", and "strace -p PID"
 still work as root).
 
-For software that has defined application-specific relationships
+In mode 1, software that has defined application-specific relationships
 between a debugging process and its inferior (crash handlers, etc),
 prctl(PR_SET_PTRACER, pid, ...) can be used. An inferior can declare which
 other process (and its descendents) are allowed to call PTRACE_ATTACH
@@ -46,6 +46,8 @@ restrictions, it can call prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, ...)
 so that any otherwise allowed process (even those in external pid namespaces)
 may attach.
 
+These restrictions do not change how ptrace via PTRACE_TRACEME operates.
+
 The sysctl settings are:
 
 0 - classic ptrace permissions: a process can PTRACE_ATTACH to any other
@@ -60,6 +62,12 @@ The sysctl settings are:
     inferior can call prctl(PR_SET_PTRACER, debugger, ...) to declare
     an allowed debugger PID to call PTRACE_ATTACH on the inferior.
 
+2 - admin-only attach: only processes with CAP_SYS_PTRACE may use ptrace
+    with PTRACE_ATTACH.
+
+3 - no attach: no processes may use ptrace with PTRACE_ATTACH. Once set,
+    this sysctl cannot be changed to a lower value.
+
 The original children-only logic was based on the restrictions in grsecurity.
 
 ==============================================================
diff --git a/Documentation/security/keys.txt b/Documentation/security/keys.txt
index d389acd31e19..aa0dbd74b71b 100644
--- a/Documentation/security/keys.txt
+++ b/Documentation/security/keys.txt
@@ -805,6 +805,23 @@ The keyctl syscall functions are:
      kernel and resumes executing userspace.
 
 
+ (*) Invalidate a key.
+
+	long keyctl(KEYCTL_INVALIDATE, key_serial_t key);
+
+     This function marks a key as being invalidated and then wakes up the
+     garbage collector.  The garbage collector immediately removes invalidated
+     keys from all keyrings and deletes the key when its reference count
+     reaches zero.
+
+     Keys that are marked invalidated become invisible to normal key operations
+     immediately, though they are still visible in /proc/keys until deleted
+     (they're marked with an 'i' flag).
+
+     A process must have search permission on the key for this function to be
+     successful.
+
+
 ===============
 KERNEL SERVICES
 ===============
diff --git a/MAINTAINERS b/MAINTAINERS
index 5ccca1ca0077..d4abe7572ead 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1733,6 +1733,7 @@ S:	Supported
 F:	include/linux/capability.h
 F:	security/capability.c
 F:	security/commoncap.c 
+F:	kernel/capability.c
 
 CELL BROADBAND ENGINE ARCHITECTURE
 M:	Arnd Bergmann <arnd@arndb.de>
@@ -5950,7 +5951,7 @@ SECURITY SUBSYSTEM
 M:	James Morris <james.l.morris@oracle.com>
 L:	linux-security-module@vger.kernel.org (suggested Cc:)
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git
-W:	http://security.wiki.kernel.org/
+W:	http://kernsec.org/
 S:	Supported
 F:	security/
 
diff --git a/arch/Kconfig b/arch/Kconfig
index bd265a217bd2..1f9461b9cc89 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -231,4 +231,27 @@ config HAVE_CMPXCHG_DOUBLE
 config ARCH_WANT_OLD_COMPAT_IPC
 	bool
 
+config HAVE_ARCH_SECCOMP_FILTER
+	bool
+	help
+	  An arch should select this symbol if it provides all of these things:
+	  - syscall_get_arch()
+	  - syscall_get_arguments()
+	  - syscall_rollback()
+	  - syscall_set_return_value()
+	  - SIGSYS siginfo_t support
+	  - secure_computing is called from a ptrace_event()-safe context
+	  - secure_computing return value is checked and a return value of -1
+	    results in the system call being skipped immediately.
+
+config SECCOMP_FILTER
+	def_bool y
+	depends on HAVE_ARCH_SECCOMP_FILTER && SECCOMP && NET
+	help
+	  Enable tasks to build secure computing environments defined
+	  in terms of Berkeley Packet Filter programs which implement
+	  task-defined system call filtering polices.
+
+	  See Documentation/prctl/seccomp_filter.txt for details.
+
 source "kernel/gcov/Kconfig"
diff --git a/arch/microblaze/kernel/ptrace.c b/arch/microblaze/kernel/ptrace.c
index 6eb2aa927d89..ab1b9db661f3 100644
--- a/arch/microblaze/kernel/ptrace.c
+++ b/arch/microblaze/kernel/ptrace.c
@@ -136,7 +136,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	long ret = 0;
 
-	secure_computing(regs->r12);
+	secure_computing_strict(regs->r12);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
 	    tracehook_report_syscall_entry(regs))
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 7c24c2973c6d..4812c6d916e4 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -535,7 +535,7 @@ static inline int audit_arch(void)
 asmlinkage void syscall_trace_enter(struct pt_regs *regs)
 {
 	/* do the secure computing check first */
-	secure_computing(regs->regs[2]);
+	secure_computing_strict(regs->regs[2]);
 
 	if (!(current->ptrace & PT_PTRACED))
 		goto out;
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 8d8e028893be..dd5e214cdf21 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1710,7 +1710,7 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	long ret = 0;
 
-	secure_computing(regs->gpr[0]);
+	secure_computing_strict(regs->gpr[0]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
 	    tracehook_report_syscall_entry(regs))
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c
index 02f300fbf070..4993e689b2c2 100644
--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -719,7 +719,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 	long ret = 0;
 
 	/* Do the secure computing check first. */
-	secure_computing(regs->gprs[2]);
+	secure_computing_strict(regs->gprs[2]);
 
 	/*
 	 * The sysc_tracesys code in entry.S stored the system
diff --git a/arch/sh/kernel/ptrace_32.c b/arch/sh/kernel/ptrace_32.c
index 9698671444e6..81f999a672f6 100644
--- a/arch/sh/kernel/ptrace_32.c
+++ b/arch/sh/kernel/ptrace_32.c
@@ -503,7 +503,7 @@ asmlinkage long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	long ret = 0;
 
-	secure_computing(regs->regs[0]);
+	secure_computing_strict(regs->regs[0]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
 	    tracehook_report_syscall_entry(regs))
diff --git a/arch/sh/kernel/ptrace_64.c b/arch/sh/kernel/ptrace_64.c
index bc81e07dc098..af90339dadcd 100644
--- a/arch/sh/kernel/ptrace_64.c
+++ b/arch/sh/kernel/ptrace_64.c
@@ -522,7 +522,7 @@ asmlinkage long long do_syscall_trace_enter(struct pt_regs *regs)
 {
 	long long ret = 0;
 
-	secure_computing(regs->regs[9]);
+	secure_computing_strict(regs->regs[9]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
 	    tracehook_report_syscall_entry(regs))
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index b2b9daf59051..1ea3fd954756 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -583,6 +583,9 @@ config SYSVIPC_COMPAT
 	depends on COMPAT && SYSVIPC
 	default y
 
+config KEYS_COMPAT
+	def_bool y if COMPAT && KEYS
+
 endmenu
 
 source "net/Kconfig"
diff --git a/arch/sparc/kernel/ptrace_64.c b/arch/sparc/kernel/ptrace_64.c
index 6f97c0767995..484dabac7045 100644
--- a/arch/sparc/kernel/ptrace_64.c
+++ b/arch/sparc/kernel/ptrace_64.c
@@ -1062,7 +1062,7 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 	int ret = 0;
 
 	/* do the secure computing check first */
-	secure_computing(regs->u_regs[UREG_G1]);
+	secure_computing_strict(regs->u_regs[UREG_G1]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		ret = tracehook_report_syscall_entry(regs);
diff --git a/arch/sparc/kernel/systbls_64.S b/arch/sparc/kernel/systbls_64.S
index db86b1a0e9a9..3a58e0d66f51 100644
--- a/arch/sparc/kernel/systbls_64.S
+++ b/arch/sparc/kernel/systbls_64.S
@@ -74,7 +74,7 @@ sys_call_table32:
 	.word sys_timer_delete, compat_sys_timer_create, sys_ni_syscall, compat_sys_io_setup, sys_io_destroy
 /*270*/	.word sys32_io_submit, sys_io_cancel, compat_sys_io_getevents, sys32_mq_open, sys_mq_unlink
 	.word compat_sys_mq_timedsend, compat_sys_mq_timedreceive, compat_sys_mq_notify, compat_sys_mq_getsetattr, compat_sys_waitid
-/*280*/	.word sys32_tee, sys_add_key, sys_request_key, sys_keyctl, compat_sys_openat
+/*280*/	.word sys32_tee, sys_add_key, sys_request_key, compat_sys_keyctl, compat_sys_openat
 	.word sys_mkdirat, sys_mknodat, sys_fchownat, compat_sys_futimesat, compat_sys_fstatat64
 /*290*/	.word sys_unlinkat, sys_renameat, sys_linkat, sys_symlinkat, sys_readlinkat
 	.word sys_fchmodat, sys_faccessat, compat_sys_pselect6, compat_sys_ppoll, sys_unshare
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c940cb6f0409..2787fbec7aed 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -83,6 +83,7 @@ config X86
 	select GENERIC_IOMAP
 	select DCACHE_WORD_ACCESS
 	select GENERIC_SMP_IDLE_THREAD
+	select HAVE_ARCH_SECCOMP_FILTER
 
 config INSTRUCTION_DECODER
 	def_bool (KPROBES || PERF_EVENTS)
diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index a69245ba27e3..0b3f2354f6aa 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -67,6 +67,10 @@ int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
 			switch (from->si_code >> 16) {
 			case __SI_FAULT >> 16:
 				break;
+			case __SI_SYS >> 16:
+				put_user_ex(from->si_syscall, &to->si_syscall);
+				put_user_ex(from->si_arch, &to->si_arch);
+				break;
 			case __SI_CHLD >> 16:
 				if (ia32) {
 					put_user_ex(from->si_utime, &to->si_utime);
diff --git a/arch/x86/include/asm/ia32.h b/arch/x86/include/asm/ia32.h
index ee52760549f0..b04cbdb138cd 100644
--- a/arch/x86/include/asm/ia32.h
+++ b/arch/x86/include/asm/ia32.h
@@ -144,6 +144,12 @@ typedef struct compat_siginfo {
 			int _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		struct {
+			unsigned int _call_addr; /* calling insn */
+			int _syscall;	/* triggering system call number */
+			unsigned int _arch;	/* AUDIT_ARCH_* of syscall */
+		} _sigsys;
 	} _sifields;
 } compat_siginfo_t;
 
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 386b78686c4d..1ace47b62592 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -13,9 +13,11 @@
 #ifndef _ASM_X86_SYSCALL_H
 #define _ASM_X86_SYSCALL_H
 
+#include <linux/audit.h>
 #include <linux/sched.h>
 #include <linux/err.h>
 #include <asm/asm-offsets.h>	/* For NR_syscalls */
+#include <asm/thread_info.h>	/* for TS_COMPAT */
 #include <asm/unistd.h>
 
 extern const unsigned long sys_call_table[];
@@ -88,6 +90,12 @@ static inline void syscall_set_arguments(struct task_struct *task,
 	memcpy(&regs->bx + i, args, n * sizeof(args[0]));
 }
 
+static inline int syscall_get_arch(struct task_struct *task,
+				   struct pt_regs *regs)
+{
+	return AUDIT_ARCH_I386;
+}
+
 #else	 /* CONFIG_X86_64 */
 
 static inline void syscall_get_arguments(struct task_struct *task,
@@ -212,6 +220,25 @@ static inline void syscall_set_arguments(struct task_struct *task,
 		}
 }
 
+static inline int syscall_get_arch(struct task_struct *task,
+				   struct pt_regs *regs)
+{
+#ifdef CONFIG_IA32_EMULATION
+	/*
+	 * TS_COMPAT is set for 32-bit syscall entry and then
+	 * remains set until we return to user mode.
+	 *
+	 * TIF_IA32 tasks should always have TS_COMPAT set at
+	 * system call time.
+	 *
+	 * x32 tasks should be considered AUDIT_ARCH_X86_64.
+	 */
+	if (task_thread_info(task)->status & TS_COMPAT)
+		return AUDIT_ARCH_I386;
+#endif
+	/* Both x32 and x86_64 are considered "64-bit". */
+	return AUDIT_ARCH_X86_64;
+}
 #endif	/* CONFIG_X86_32 */
 
 #endif	/* _ASM_X86_SYSCALL_H */
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 685845cf16e0..13b1990c7c58 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1480,7 +1480,11 @@ long syscall_trace_enter(struct pt_regs *regs)
 		regs->flags |= X86_EFLAGS_TF;
 
 	/* do the secure computing check first */
-	secure_computing(regs->orig_ax);
+	if (secure_computing(regs->orig_ax)) {
+		/* seccomp failures shouldn't expose any additional code. */
+		ret = -1L;
+		goto out;
+	}
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_EMU)))
 		ret = -1L;
@@ -1505,6 +1509,7 @@ long syscall_trace_enter(struct pt_regs *regs)
 				    regs->dx, regs->r10);
 #endif
 
+out:
 	return ret ?: regs->orig_ax;
 }
 
diff --git a/fs/exec.c b/fs/exec.c
index b1fd2025e59a..d038968b54b4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1245,6 +1245,13 @@ static int check_unsafe_exec(struct linux_binprm *bprm)
 			bprm->unsafe |= LSM_UNSAFE_PTRACE;
 	}
 
+	/*
+	 * This isn't strictly necessary, but it makes it harder for LSMs to
+	 * mess up.
+	 */
+	if (current->no_new_privs)
+		bprm->unsafe |= LSM_UNSAFE_NO_NEW_PRIVS;
+
 	n_fs = 1;
 	spin_lock(&p->fs->lock);
 	rcu_read_lock();
@@ -1288,7 +1295,8 @@ int prepare_binprm(struct linux_binprm *bprm)
 	bprm->cred->euid = current_euid();
 	bprm->cred->egid = current_egid();
 
-	if (!(bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)) {
+	if (!(bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID) &&
+	    !current->no_new_privs) {
 		/* Set-uid? */
 		if (mode & S_ISUID) {
 			bprm->per_clear |= PER_CLEAR_ON_SETID;
diff --git a/fs/open.c b/fs/open.c
index 5720854156db..5eccdcea2d1b 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -681,7 +681,7 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
 
 	f->f_op = fops_get(inode->i_fop);
 
-	error = security_dentry_open(f, cred);
+	error = security_file_open(f, cred);
 	if (error)
 		goto cleanup_all;
 
diff --git a/include/asm-generic/siginfo.h b/include/asm-generic/siginfo.h
index 5e5e3865f1ed..8ed67779fc09 100644
--- a/include/asm-generic/siginfo.h
+++ b/include/asm-generic/siginfo.h
@@ -98,9 +98,18 @@ typedef struct siginfo {
 			__ARCH_SI_BAND_T _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGSYS */
+		struct {
+			void __user *_call_addr; /* calling user insn */
+			int _syscall;	/* triggering system call number */
+			unsigned int _arch;	/* AUDIT_ARCH_* of syscall */
+		} _sigsys;
 	} _sifields;
 } __ARCH_SI_ATTRIBUTES siginfo_t;
 
+/* If the arch shares siginfo, then it has SIGSYS. */
+#define __ARCH_SIGSYS
 #endif
 
 /*
@@ -124,6 +133,11 @@ typedef struct siginfo {
 #define si_addr_lsb	_sifields._sigfault._addr_lsb
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
+#ifdef __ARCH_SIGSYS
+#define si_call_addr	_sifields._sigsys._call_addr
+#define si_syscall	_sifields._sigsys._syscall
+#define si_arch		_sifields._sigsys._arch
+#endif
 
 #ifdef __KERNEL__
 #define __SI_MASK	0xffff0000u
@@ -134,6 +148,7 @@ typedef struct siginfo {
 #define __SI_CHLD	(4 << 16)
 #define __SI_RT		(5 << 16)
 #define __SI_MESGQ	(6 << 16)
+#define __SI_SYS	(7 << 16)
 #define __SI_CODE(T,N)	((T) | ((N) & 0xffff))
 #else
 #define __SI_KILL	0
@@ -143,6 +158,7 @@ typedef struct siginfo {
 #define __SI_CHLD	0
 #define __SI_RT		0
 #define __SI_MESGQ	0
+#define __SI_SYS	0
 #define __SI_CODE(T,N)	(N)
 #endif
 
@@ -240,6 +256,12 @@ typedef struct siginfo {
 #define NSIGPOLL	6
 
 /*
+ * SIGSYS si_codes
+ */
+#define SYS_SECCOMP		(__SI_SYS|1)	/* seccomp triggered */
+#define NSIGSYS	1
+
+/*
  * sigevent definitions
  * 
  * It seems likely that SIGEV_THREAD will have to be handled from 
diff --git a/include/asm-generic/syscall.h b/include/asm-generic/syscall.h
index 5c122ae6bfa6..5b09392db673 100644
--- a/include/asm-generic/syscall.h
+++ b/include/asm-generic/syscall.h
@@ -142,4 +142,18 @@ void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
 			   unsigned int i, unsigned int n,
 			   const unsigned long *args);
 
+/**
+ * syscall_get_arch - return the AUDIT_ARCH for the current system call
+ * @task:	task of interest, must be in system call entry tracing
+ * @regs:	task_pt_regs() of @task
+ *
+ * Returns the AUDIT_ARCH_* based on the system call convention in use.
+ *
+ * It's only valid to call this when @task is stopped on entry to a system
+ * call, due to %TIF_SYSCALL_TRACE, %TIF_SYSCALL_AUDIT, or %TIF_SECCOMP.
+ *
+ * Architectures which permit CONFIG_HAVE_ARCH_SECCOMP_FILTER must
+ * provide an implementation of this.
+ */
+int syscall_get_arch(struct task_struct *task, struct pt_regs *regs);
 #endif	/* _ASM_SYSCALL_H */
diff --git a/include/keys/keyring-type.h b/include/keys/keyring-type.h
index 843f872a4b63..cf49159b0e3a 100644
--- a/include/keys/keyring-type.h
+++ b/include/keys/keyring-type.h
@@ -24,7 +24,7 @@ struct keyring_list {
 	unsigned short	maxkeys;	/* max keys this list can hold */
 	unsigned short	nkeys;		/* number of keys currently held */
 	unsigned short	delkey;		/* key to be unlinked by RCU */
-	struct key	*keys[0];
+	struct key __rcu *keys[0];
 };
 
 
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index b5d568fa19e8..0237b84ba541 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -330,6 +330,7 @@ header-y += scc.h
 header-y += sched.h
 header-y += screen_info.h
 header-y += sdla.h
+header-y += seccomp.h
 header-y += securebits.h
 header-y += selinux_netlink.h
 header-y += sem.h
diff --git a/include/linux/audit.h b/include/linux/audit.h
index ed3ef1972496..22f292a917a3 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -463,7 +463,7 @@ extern void audit_putname(const char *name);
 extern void __audit_inode(const char *name, const struct dentry *dentry);
 extern void __audit_inode_child(const struct dentry *dentry,
 				const struct inode *parent);
-extern void __audit_seccomp(unsigned long syscall);
+extern void __audit_seccomp(unsigned long syscall, long signr, int code);
 extern void __audit_ptrace(struct task_struct *t);
 
 static inline int audit_dummy_context(void)
@@ -508,10 +508,10 @@ static inline void audit_inode_child(const struct dentry *dentry,
 }
 void audit_core_dumps(long signr);
 
-static inline void audit_seccomp(unsigned long syscall)
+static inline void audit_seccomp(unsigned long syscall, long signr, int code)
 {
 	if (unlikely(!audit_dummy_context()))
-		__audit_seccomp(syscall);
+		__audit_seccomp(syscall, signr, code);
 }
 
 static inline void audit_ptrace(struct task_struct *t)
@@ -634,7 +634,7 @@ extern int audit_signals;
 #define audit_inode(n,d) do { (void)(d); } while (0)
 #define audit_inode_child(i,p) do { ; } while (0)
 #define audit_core_dumps(i) do { ; } while (0)
-#define audit_seccomp(i) do { ; } while (0)
+#define audit_seccomp(i,s,c) do { ; } while (0)
 #define auditsc_get_stamp(c,t,s) (0)
 #define audit_get_loginuid(t) (-1)
 #define audit_get_sessionid(t) (-1)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 72090994d789..82b01357af8b 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -10,6 +10,7 @@
 
 #ifdef __KERNEL__
 #include <linux/atomic.h>
+#include <linux/compat.h>
 #endif
 
 /*
@@ -133,6 +134,16 @@ struct sock_fprog {	/* Required for SO_ATTACH_FILTER. */
 
 #ifdef __KERNEL__
 
+#ifdef CONFIG_COMPAT
+/*
+ * A struct sock_filter is architecture independent.
+ */
+struct compat_sock_fprog {
+	u16		len;
+	compat_uptr_t	filter;		/* struct sock_filter * */
+};
+#endif
+
 struct sk_buff;
 struct sock;
 
@@ -233,6 +244,7 @@ enum {
 	BPF_S_ANC_RXHASH,
 	BPF_S_ANC_CPU,
 	BPF_S_ANC_ALU_XOR_X,
+	BPF_S_ANC_SECCOMP_LD_W,
 };
 
 #endif /* __KERNEL__ */
diff --git a/include/linux/key.h b/include/linux/key.h
index 96933b1e5d24..5231800770e1 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -124,7 +124,10 @@ static inline unsigned long is_key_possessed(const key_ref_t key_ref)
 struct key {
 	atomic_t		usage;		/* number of references */
 	key_serial_t		serial;		/* key serial number */
-	struct rb_node		serial_node;
+	union {
+		struct list_head graveyard_link;
+		struct rb_node	serial_node;
+	};
 	struct key_type		*type;		/* type of key */
 	struct rw_semaphore	sem;		/* change vs change sem */
 	struct key_user		*user;		/* owner of this key */
@@ -133,6 +136,7 @@ struct key {
 		time_t		expiry;		/* time at which key expires (or 0) */
 		time_t		revoked_at;	/* time at which key was revoked */
 	};
+	time_t			last_used_at;	/* last time used for LRU keyring discard */
 	uid_t			uid;
 	gid_t			gid;
 	key_perm_t		perm;		/* access permissions */
@@ -156,6 +160,7 @@ struct key {
 #define KEY_FLAG_USER_CONSTRUCT	4	/* set if key is being constructed in userspace */
 #define KEY_FLAG_NEGATIVE	5	/* set if key is negative */
 #define KEY_FLAG_ROOT_CAN_CLEAR	6	/* set if key can be cleared by root without permission */
+#define KEY_FLAG_INVALIDATED	7	/* set if key has been invalidated */
 
 	/* the description string
 	 * - this is used to match a key against search criteria
@@ -199,6 +204,7 @@ extern struct key *key_alloc(struct key_type *type,
 #define KEY_ALLOC_NOT_IN_QUOTA	0x0002	/* not in quota */
 
 extern void key_revoke(struct key *key);
+extern void key_invalidate(struct key *key);
 extern void key_put(struct key *key);
 
 static inline struct key *key_get(struct key *key)
@@ -236,7 +242,7 @@ extern struct key *request_key_async_with_auxdata(struct key_type *type,
 
 extern int wait_for_key_construction(struct key *key, bool intr);
 
-extern int key_validate(struct key *key);
+extern int key_validate(const struct key *key);
 
 extern key_ref_t key_create_or_update(key_ref_t keyring,
 				      const char *type,
@@ -319,6 +325,7 @@ extern void key_init(void);
 #define key_serial(k)			0
 #define key_get(k) 			({ NULL; })
 #define key_revoke(k)			do { } while(0)
+#define key_invalidate(k)		do { } while(0)
 #define key_put(k)			do { } while(0)
 #define key_ref_put(k)			do { } while(0)
 #define make_key_ref(k, p)		NULL
diff --git a/include/linux/keyctl.h b/include/linux/keyctl.h
index 9b0b865ce622..c9b7f4faf97a 100644
--- a/include/linux/keyctl.h
+++ b/include/linux/keyctl.h
@@ -55,5 +55,6 @@
 #define KEYCTL_SESSION_TO_PARENT	18	/* apply session keyring to parent process */
 #define KEYCTL_REJECT			19	/* reject a partially constructed key */
 #define KEYCTL_INSTANTIATE_IOV		20	/* instantiate a partially constructed key */
+#define KEYCTL_INVALIDATE		21	/* invalidate a key */
 
 #endif /*  _LINUX_KEYCTL_H */
diff --git a/include/linux/lsm_audit.h b/include/linux/lsm_audit.h
index fad48aab893b..1cc89e9df480 100644
--- a/include/linux/lsm_audit.h
+++ b/include/linux/lsm_audit.h
@@ -53,7 +53,6 @@ struct common_audit_data {
 #define LSM_AUDIT_DATA_KMOD	8
 #define LSM_AUDIT_DATA_INODE	9
 #define LSM_AUDIT_DATA_DENTRY	10
-	struct task_struct *tsk;
 	union 	{
 		struct path path;
 		struct dentry *dentry;
@@ -93,11 +92,6 @@ int ipv4_skb_to_auditdata(struct sk_buff *skb,
 int ipv6_skb_to_auditdata(struct sk_buff *skb,
 		struct common_audit_data *ad, u8 *proto);
 
-/* Initialize an LSM audit data structure. */
-#define COMMON_AUDIT_DATA_INIT(_d, _t) \
-	{ memset((_d), 0, sizeof(struct common_audit_data)); \
-	 (_d)->type = LSM_AUDIT_DATA_##_t; }
-
 void common_lsm_audit(struct common_audit_data *a,
 	void (*pre_audit)(struct audit_buffer *, void *),
 	void (*post_audit)(struct audit_buffer *, void *));
diff --git a/include/linux/prctl.h b/include/linux/prctl.h
index e0cfec2490aa..78b76e24cc7e 100644
--- a/include/linux/prctl.h
+++ b/include/linux/prctl.h
@@ -124,4 +124,19 @@
 #define PR_SET_CHILD_SUBREAPER 36
 #define PR_GET_CHILD_SUBREAPER 37
 
+/*
+ * If no_new_privs is set, then operations that grant new privileges (i.e.
+ * execve) will either fail or not grant them.  This affects suid/sgid,
+ * file capabilities, and LSMs.
+ *
+ * Operations that merely manipulate or drop existing privileges (setresuid,
+ * capset, etc.) will still work.  Drop those privileges if you want them gone.
+ *
+ * Changing LSM security domain is considered a new privilege.  So, for example,
+ * asking selinux for a specific new context (e.g. with runcon) will result
+ * in execve returning -EPERM.
+ */
+#define PR_SET_NO_NEW_PRIVS 38
+#define PR_GET_NO_NEW_PRIVS 39
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 5c719627c2aa..597e4fdb97fe 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -58,6 +58,7 @@
 #define PTRACE_EVENT_EXEC	4
 #define PTRACE_EVENT_VFORK_DONE	5
 #define PTRACE_EVENT_EXIT	6
+#define PTRACE_EVENT_SECCOMP	7
 /* Extended result codes which enabled by means other than options.  */
 #define PTRACE_EVENT_STOP	128
 
@@ -69,8 +70,9 @@
 #define PTRACE_O_TRACEEXEC	(1 << PTRACE_EVENT_EXEC)
 #define PTRACE_O_TRACEVFORKDONE	(1 << PTRACE_EVENT_VFORK_DONE)
 #define PTRACE_O_TRACEEXIT	(1 << PTRACE_EVENT_EXIT)
+#define PTRACE_O_TRACESECCOMP	(1 << PTRACE_EVENT_SECCOMP)
 
-#define PTRACE_O_MASK		0x0000007f
+#define PTRACE_O_MASK		0x000000ff
 
 #include <asm/ptrace.h>
 
@@ -98,6 +100,7 @@
 #define PT_TRACE_EXEC		PT_EVENT_FLAG(PTRACE_EVENT_EXEC)
 #define PT_TRACE_VFORK_DONE	PT_EVENT_FLAG(PTRACE_EVENT_VFORK_DONE)
 #define PT_TRACE_EXIT		PT_EVENT_FLAG(PTRACE_EVENT_EXIT)
+#define PT_TRACE_SECCOMP	PT_EVENT_FLAG(PTRACE_EVENT_SECCOMP)
 
 /* single stepping state bits (used on ARM and PA-RISC) */
 #define PT_SINGLESTEP_BIT	31
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8f3fd945070f..f774d88cd0aa 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1341,6 +1341,8 @@ struct task_struct {
 				 * execve */
 	unsigned in_iowait:1;
 
+	/* task may not gain privileges */
+	unsigned no_new_privs:1;
 
 	/* Revert to default priority/policy when forking */
 	unsigned sched_reset_on_fork:1;
@@ -1450,7 +1452,7 @@ struct task_struct {
 	uid_t loginuid;
 	unsigned int sessionid;
 #endif
-	seccomp_t seccomp;
+	struct seccomp seccomp;
 
 /* Thread group tracking */
    	u32 parent_exec_id;
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index cc7a4e9cc7ad..84f6320da50f 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -1,25 +1,90 @@
 #ifndef _LINUX_SECCOMP_H
 #define _LINUX_SECCOMP_H
 
-
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+
+/* Valid values for seccomp.mode and prctl(PR_SET_SECCOMP, <mode>) */
+#define SECCOMP_MODE_DISABLED	0 /* seccomp is not in use. */
+#define SECCOMP_MODE_STRICT	1 /* uses hard-coded filter. */
+#define SECCOMP_MODE_FILTER	2 /* uses user-supplied filter. */
+
+/*
+ * All BPF programs must return a 32-bit value.
+ * The bottom 16-bits are for optional return data.
+ * The upper 16-bits are ordered from least permissive values to most.
+ *
+ * The ordering ensures that a min_t() over composed return values always
+ * selects the least permissive choice.
+ */
+#define SECCOMP_RET_KILL	0x00000000U /* kill the task immediately */
+#define SECCOMP_RET_TRAP	0x00030000U /* disallow and force a SIGSYS */
+#define SECCOMP_RET_ERRNO	0x00050000U /* returns an errno */
+#define SECCOMP_RET_TRACE	0x7ff00000U /* pass to a tracer or disallow */
+#define SECCOMP_RET_ALLOW	0x7fff0000U /* allow */
+
+/* Masks for the return value sections. */
+#define SECCOMP_RET_ACTION	0x7fff0000U
+#define SECCOMP_RET_DATA	0x0000ffffU
+
+/**
+ * struct seccomp_data - the format the BPF program executes over.
+ * @nr: the system call number
+ * @arch: indicates system call convention as an AUDIT_ARCH_* value
+ *        as defined in <linux/audit.h>.
+ * @instruction_pointer: at the time of the system call.
+ * @args: up to 6 system call arguments always stored as 64-bit values
+ *        regardless of the architecture.
+ */
+struct seccomp_data {
+	int nr;
+	__u32 arch;
+	__u64 instruction_pointer;
+	__u64 args[6];
+};
+
+#ifdef __KERNEL__
 #ifdef CONFIG_SECCOMP
 
 #include <linux/thread_info.h>
 #include <asm/seccomp.h>
 
-typedef struct { int mode; } seccomp_t;
-
-extern void __secure_computing(int);
-static inline void secure_computing(int this_syscall)
+struct seccomp_filter;
+/**
+ * struct seccomp - the state of a seccomp'ed process
+ *
+ * @mode:  indicates one of the valid values above for controlled
+ *         system calls available to a process.
+ * @filter: The metadata and ruleset for determining what system calls
+ *          are allowed for a task.
+ *
+ *          @filter must only be accessed from the context of current as there
+ *          is no locking.
+ */
+struct seccomp {
+	int mode;
+	struct seccomp_filter *filter;
+};
+
+extern int __secure_computing(int);
+static inline int secure_computing(int this_syscall)
 {
 	if (unlikely(test_thread_flag(TIF_SECCOMP)))
-		__secure_computing(this_syscall);
+		return  __secure_computing(this_syscall);
+	return 0;
+}
+
+/* A wrapper for architectures supporting only SECCOMP_MODE_STRICT. */
+static inline void secure_computing_strict(int this_syscall)
+{
+	BUG_ON(secure_computing(this_syscall) != 0);
 }
 
 extern long prctl_get_seccomp(void);
-extern long prctl_set_seccomp(unsigned long);
+extern long prctl_set_seccomp(unsigned long, char __user *);
 
-static inline int seccomp_mode(seccomp_t *s)
+static inline int seccomp_mode(struct seccomp *s)
 {
 	return s->mode;
 }
@@ -28,25 +93,41 @@ static inline int seccomp_mode(seccomp_t *s)
 
 #include <linux/errno.h>
 
-typedef struct { } seccomp_t;
+struct seccomp { };
+struct seccomp_filter { };
 
-#define secure_computing(x) do { } while (0)
+static inline int secure_computing(int this_syscall) { return 0; }
+static inline void secure_computing_strict(int this_syscall) { return; }
 
 static inline long prctl_get_seccomp(void)
 {
 	return -EINVAL;
 }
 
-static inline long prctl_set_seccomp(unsigned long arg2)
+static inline long prctl_set_seccomp(unsigned long arg2, char __user *arg3)
 {
 	return -EINVAL;
 }
 
-static inline int seccomp_mode(seccomp_t *s)
+static inline int seccomp_mode(struct seccomp *s)
 {
 	return 0;
 }
-
 #endif /* CONFIG_SECCOMP */
 
+#ifdef CONFIG_SECCOMP_FILTER
+extern void put_seccomp_filter(struct task_struct *tsk);
+extern void get_seccomp_filter(struct task_struct *tsk);
+extern u32 seccomp_bpf_load(int off);
+#else  /* CONFIG_SECCOMP_FILTER */
+static inline void put_seccomp_filter(struct task_struct *tsk)
+{
+	return;
+}
+static inline void get_seccomp_filter(struct task_struct *tsk)
+{
+	return;
+}
+#endif /* CONFIG_SECCOMP_FILTER */
+#endif /* __KERNEL__ */
 #endif /* _LINUX_SECCOMP_H */
diff --git a/include/linux/security.h b/include/linux/security.h
index 673afbb8238a..ab0e091ce5fa 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -144,6 +144,7 @@ struct request_sock;
 #define LSM_UNSAFE_SHARE	1
 #define LSM_UNSAFE_PTRACE	2
 #define LSM_UNSAFE_PTRACE_CAP	4
+#define LSM_UNSAFE_NO_NEW_PRIVS	8
 
 #ifdef CONFIG_MMU
 extern int mmap_min_addr_handler(struct ctl_table *table, int write,
@@ -639,10 +640,7 @@ static inline void security_free_mnt_opts(struct security_mnt_opts *opts)
  *	to receive an open file descriptor via socket IPC.
  *	@file contains the file structure being received.
  *	Return 0 if permission is granted.
- *
- * Security hook for dentry
- *
- * @dentry_open
+ * @file_open
  *	Save open-time permission checking state for later use upon
  *	file_permission, and recheck access if anything has changed
  *	since inode_permission.
@@ -1497,7 +1495,7 @@ struct security_operations {
 	int (*file_send_sigiotask) (struct task_struct *tsk,
 				    struct fown_struct *fown, int sig);
 	int (*file_receive) (struct file *file);
-	int (*dentry_open) (struct file *file, const struct cred *cred);
+	int (*file_open) (struct file *file, const struct cred *cred);
 
 	int (*task_create) (unsigned long clone_flags);
 	void (*task_free) (struct task_struct *task);
@@ -1756,7 +1754,7 @@ int security_file_set_fowner(struct file *file);
 int security_file_send_sigiotask(struct task_struct *tsk,
 				 struct fown_struct *fown, int sig);
 int security_file_receive(struct file *file);
-int security_dentry_open(struct file *file, const struct cred *cred);
+int security_file_open(struct file *file, const struct cred *cred);
 int security_task_create(unsigned long clone_flags);
 void security_task_free(struct task_struct *task);
 int security_cred_alloc_blank(struct cred *cred, gfp_t gfp);
@@ -2227,8 +2225,8 @@ static inline int security_file_receive(struct file *file)
 	return 0;
 }
 
-static inline int security_dentry_open(struct file *file,
-				       const struct cred *cred)
+static inline int security_file_open(struct file *file,
+				     const struct cred *cred)
 {
 	return 0;
 }
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index af1de0f34eae..4b96415527b8 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -67,6 +67,7 @@
 #include <linux/syscalls.h>
 #include <linux/capability.h>
 #include <linux/fs_struct.h>
+#include <linux/compat.h>
 
 #include "audit.h"
 
@@ -2710,13 +2711,16 @@ void audit_core_dumps(long signr)
 	audit_log_end(ab);
 }
 
-void __audit_seccomp(unsigned long syscall)
+void __audit_seccomp(unsigned long syscall, long signr, int code)
 {
 	struct audit_buffer *ab;
 
 	ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_ANOM_ABEND);
-	audit_log_abend(ab, "seccomp", SIGKILL);
+	audit_log_abend(ab, "seccomp", signr);
 	audit_log_format(ab, " syscall=%ld", syscall);
+	audit_log_format(ab, " compat=%d", is_compat_task());
+	audit_log_format(ab, " ip=0x%lx", KSTK_EIP(current));
+	audit_log_format(ab, " code=0x%x", code);
 	audit_log_end(ab);
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 9f9b296fa6df..ad54c833116a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -34,6 +34,7 @@
 #include <linux/cgroup.h>
 #include <linux/security.h>
 #include <linux/hugetlb.h>
+#include <linux/seccomp.h>
 #include <linux/swap.h>
 #include <linux/syscalls.h>
 #include <linux/jiffies.h>
@@ -206,6 +207,7 @@ void free_task(struct task_struct *tsk)
 	free_thread_info(tsk->stack);
 	rt_mutex_debug_task_free(tsk);
 	ftrace_graph_exit_task(tsk);
+	put_seccomp_filter(tsk);
 	free_task_struct(tsk);
 }
 EXPORT_SYMBOL(free_task);
@@ -1192,6 +1194,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 		goto fork_out;
 
 	ftrace_graph_init_task(p);
+	get_seccomp_filter(p);
 
 	rt_mutex_init_task(p);
 
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index e8d76c5895ea..ee376beedaf9 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -3,16 +3,357 @@
  *
  * Copyright 2004-2005  Andrea Arcangeli <andrea@cpushare.com>
  *
- * This defines a simple but solid secure-computing mode.
+ * Copyright (C) 2012 Google, Inc.
+ * Will Drewry <wad@chromium.org>
+ *
+ * This defines a simple but solid secure-computing facility.
+ *
+ * Mode 1 uses a fixed list of allowed system calls.
+ * Mode 2 allows user-defined system call filters in the form
+ *        of Berkeley Packet Filters/Linux Socket Filters.
  */
 
+#include <linux/atomic.h>
 #include <linux/audit.h>
-#include <linux/seccomp.h>
-#include <linux/sched.h>
 #include <linux/compat.h>
+#include <linux/sched.h>
+#include <linux/seccomp.h>
 
 /* #define SECCOMP_DEBUG 1 */
-#define NR_SECCOMP_MODES 1
+
+#ifdef CONFIG_SECCOMP_FILTER
+#include <asm/syscall.h>
+#include <linux/filter.h>
+#include <linux/ptrace.h>
+#include <linux/security.h>
+#include <linux/slab.h>
+#include <linux/tracehook.h>
+#include <linux/uaccess.h>
+
+/**
+ * struct seccomp_filter - container for seccomp BPF programs
+ *
+ * @usage: reference count to manage the object lifetime.
+ *         get/put helpers should be used when accessing an instance
+ *         outside of a lifetime-guarded section.  In general, this
+ *         is only needed for handling filters shared across tasks.
+ * @prev: points to a previously installed, or inherited, filter
+ * @len: the number of instructions in the program
+ * @insns: the BPF program instructions to evaluate
+ *
+ * seccomp_filter objects are organized in a tree linked via the @prev
+ * pointer.  For any task, it appears to be a singly-linked list starting
+ * with current->seccomp.filter, the most recently attached or inherited filter.
+ * However, multiple filters may share a @prev node, by way of fork(), which
+ * results in a unidirectional tree existing in memory.  This is similar to
+ * how namespaces work.
+ *
+ * seccomp_filter objects should never be modified after being attached
+ * to a task_struct (other than @usage).
+ */
+struct seccomp_filter {
+	atomic_t usage;
+	struct seccomp_filter *prev;
+	unsigned short len;  /* Instruction count */
+	struct sock_filter insns[];
+};
+
+/* Limit any path through the tree to 256KB worth of instructions. */
+#define MAX_INSNS_PER_PATH ((1 << 18) / sizeof(struct sock_filter))
+
+/**
+ * get_u32 - returns a u32 offset into data
+ * @data: a unsigned 64 bit value
+ * @index: 0 or 1 to return the first or second 32-bits
+ *
+ * This inline exists to hide the length of unsigned long.  If a 32-bit
+ * unsigned long is passed in, it will be extended and the top 32-bits will be
+ * 0. If it is a 64-bit unsigned long, then whatever data is resident will be
+ * properly returned.
+ *
+ * Endianness is explicitly ignored and left for BPF program authors to manage
+ * as per the specific architecture.
+ */
+static inline u32 get_u32(u64 data, int index)
+{
+	return ((u32 *)&data)[index];
+}
+
+/* Helper for bpf_load below. */
+#define BPF_DATA(_name) offsetof(struct seccomp_data, _name)
+/**
+ * bpf_load: checks and returns a pointer to the requested offset
+ * @off: offset into struct seccomp_data to load from
+ *
+ * Returns the requested 32-bits of data.
+ * seccomp_check_filter() should assure that @off is 32-bit aligned
+ * and not out of bounds.  Failure to do so is a BUG.
+ */
+u32 seccomp_bpf_load(int off)
+{
+	struct pt_regs *regs = task_pt_regs(current);
+	if (off == BPF_DATA(nr))
+		return syscall_get_nr(current, regs);
+	if (off == BPF_DATA(arch))
+		return syscall_get_arch(current, regs);
+	if (off >= BPF_DATA(args[0]) && off < BPF_DATA(args[6])) {
+		unsigned long value;
+		int arg = (off - BPF_DATA(args[0])) / sizeof(u64);
+		int index = !!(off % sizeof(u64));
+		syscall_get_arguments(current, regs, arg, 1, &value);
+		return get_u32(value, index);
+	}
+	if (off == BPF_DATA(instruction_pointer))
+		return get_u32(KSTK_EIP(current), 0);
+	if (off == BPF_DATA(instruction_pointer) + sizeof(u32))
+		return get_u32(KSTK_EIP(current), 1);
+	/* seccomp_check_filter should make this impossible. */
+	BUG();
+}
+
+/**
+ *	seccomp_check_filter - verify seccomp filter code
+ *	@filter: filter to verify
+ *	@flen: length of filter
+ *
+ * Takes a previously checked filter (by sk_chk_filter) and
+ * redirects all filter code that loads struct sk_buff data
+ * and related data through seccomp_bpf_load.  It also
+ * enforces length and alignment checking of those loads.
+ *
+ * Returns 0 if the rule set is legal or -EINVAL if not.
+ */
+static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
+{
+	int pc;
+	for (pc = 0; pc < flen; pc++) {
+		struct sock_filter *ftest = &filter[pc];
+		u16 code = ftest->code;
+		u32 k = ftest->k;
+
+		switch (code) {
+		case BPF_S_LD_W_ABS:
+			ftest->code = BPF_S_ANC_SECCOMP_LD_W;
+			/* 32-bit aligned and not out of bounds. */
+			if (k >= sizeof(struct seccomp_data) || k & 3)
+				return -EINVAL;
+			continue;
+		case BPF_S_LD_W_LEN:
+			ftest->code = BPF_S_LD_IMM;
+			ftest->k = sizeof(struct seccomp_data);
+			continue;
+		case BPF_S_LDX_W_LEN:
+			ftest->code = BPF_S_LDX_IMM;
+			ftest->k = sizeof(struct seccomp_data);
+			continue;
+		/* Explicitly include allowed calls. */
+		case BPF_S_RET_K:
+		case BPF_S_RET_A:
+		case BPF_S_ALU_ADD_K:
+		case BPF_S_ALU_ADD_X:
+		case BPF_S_ALU_SUB_K:
+		case BPF_S_ALU_SUB_X:
+		case BPF_S_ALU_MUL_K:
+		case BPF_S_ALU_MUL_X:
+		case BPF_S_ALU_DIV_X:
+		case BPF_S_ALU_AND_K:
+		case BPF_S_ALU_AND_X:
+		case BPF_S_ALU_OR_K:
+		case BPF_S_ALU_OR_X:
+		case BPF_S_ALU_LSH_K:
+		case BPF_S_ALU_LSH_X:
+		case BPF_S_ALU_RSH_K:
+		case BPF_S_ALU_RSH_X:
+		case BPF_S_ALU_NEG:
+		case BPF_S_LD_IMM:
+		case BPF_S_LDX_IMM:
+		case BPF_S_MISC_TAX:
+		case BPF_S_MISC_TXA:
+		case BPF_S_ALU_DIV_K:
+		case BPF_S_LD_MEM:
+		case BPF_S_LDX_MEM:
+		case BPF_S_ST:
+		case BPF_S_STX:
+		case BPF_S_JMP_JA:
+		case BPF_S_JMP_JEQ_K:
+		case BPF_S_JMP_JEQ_X:
+		case BPF_S_JMP_JGE_K:
+		case BPF_S_JMP_JGE_X:
+		case BPF_S_JMP_JGT_K:
+		case BPF_S_JMP_JGT_X:
+		case BPF_S_JMP_JSET_K:
+		case BPF_S_JMP_JSET_X:
+			continue;
+		default:
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/**
+ * seccomp_run_filters - evaluates all seccomp filters against @syscall
+ * @syscall: number of the current system call
+ *
+ * Returns valid seccomp BPF response codes.
+ */
+static u32 seccomp_run_filters(int syscall)
+{
+	struct seccomp_filter *f;
+	u32 ret = SECCOMP_RET_ALLOW;
+
+	/* Ensure unexpected behavior doesn't result in failing open. */
+	if (WARN_ON(current->seccomp.filter == NULL))
+		return SECCOMP_RET_KILL;
+
+	/*
+	 * All filters in the list are evaluated and the lowest BPF return
+	 * value always takes priority (ignoring the DATA).
+	 */
+	for (f = current->seccomp.filter; f; f = f->prev) {
+		u32 cur_ret = sk_run_filter(NULL, f->insns);
+		if ((cur_ret & SECCOMP_RET_ACTION) < (ret & SECCOMP_RET_ACTION))
+			ret = cur_ret;
+	}
+	return ret;
+}
+
+/**
+ * seccomp_attach_filter: Attaches a seccomp filter to current.
+ * @fprog: BPF program to install
+ *
+ * Returns 0 on success or an errno on failure.
+ */
+static long seccomp_attach_filter(struct sock_fprog *fprog)
+{
+	struct seccomp_filter *filter;
+	unsigned long fp_size = fprog->len * sizeof(struct sock_filter);
+	unsigned long total_insns = fprog->len;
+	long ret;
+
+	if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
+		return -EINVAL;
+
+	for (filter = current->seccomp.filter; filter; filter = filter->prev)
+		total_insns += filter->len + 4;  /* include a 4 instr penalty */
+	if (total_insns > MAX_INSNS_PER_PATH)
+		return -ENOMEM;
+
+	/*
+	 * Installing a seccomp filter requires that the task have
+	 * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
+	 * This avoids scenarios where unprivileged tasks can affect the
+	 * behavior of privileged children.
+	 */
+	if (!current->no_new_privs &&
+	    security_capable_noaudit(current_cred(), current_user_ns(),
+				     CAP_SYS_ADMIN) != 0)
+		return -EACCES;
+
+	/* Allocate a new seccomp_filter */
+	filter = kzalloc(sizeof(struct seccomp_filter) + fp_size,
+			 GFP_KERNEL|__GFP_NOWARN);
+	if (!filter)
+		return -ENOMEM;
+	atomic_set(&filter->usage, 1);
+	filter->len = fprog->len;
+
+	/* Copy the instructions from fprog. */
+	ret = -EFAULT;
+	if (copy_from_user(filter->insns, fprog->filter, fp_size))
+		goto fail;
+
+	/* Check and rewrite the fprog via the skb checker */
+	ret = sk_chk_filter(filter->insns, filter->len);
+	if (ret)
+		goto fail;
+
+	/* Check and rewrite the fprog for seccomp use */
+	ret = seccomp_check_filter(filter->insns, filter->len);
+	if (ret)
+		goto fail;
+
+	/*
+	 * If there is an existing filter, make it the prev and don't drop its
+	 * task reference.
+	 */
+	filter->prev = current->seccomp.filter;
+	current->seccomp.filter = filter;
+	return 0;
+fail:
+	kfree(filter);
+	return ret;
+}
+
+/**
+ * seccomp_attach_user_filter - attaches a user-supplied sock_fprog
+ * @user_filter: pointer to the user data containing a sock_fprog.
+ *
+ * Returns 0 on success and non-zero otherwise.
+ */
+long seccomp_attach_user_filter(char __user *user_filter)
+{
+	struct sock_fprog fprog;
+	long ret = -EFAULT;
+
+#ifdef CONFIG_COMPAT
+	if (is_compat_task()) {
+		struct compat_sock_fprog fprog32;
+		if (copy_from_user(&fprog32, user_filter, sizeof(fprog32)))
+			goto out;
+		fprog.len = fprog32.len;
+		fprog.filter = compat_ptr(fprog32.filter);
+	} else /* falls through to the if below. */
+#endif
+	if (copy_from_user(&fprog, user_filter, sizeof(fprog)))
+		goto out;
+	ret = seccomp_attach_filter(&fprog);
+out:
+	return ret;
+}
+
+/* get_seccomp_filter - increments the reference count of the filter on @tsk */
+void get_seccomp_filter(struct task_struct *tsk)
+{
+	struct seccomp_filter *orig = tsk->seccomp.filter;
+	if (!orig)
+		return;
+	/* Reference count is bounded by the number of total processes. */
+	atomic_inc(&orig->usage);
+}
+
+/* put_seccomp_filter - decrements the ref count of tsk->seccomp.filter */
+void put_seccomp_filter(struct task_struct *tsk)
+{
+	struct seccomp_filter *orig = tsk->seccomp.filter;
+	/* Clean up single-reference branches iteratively. */
+	while (orig && atomic_dec_and_test(&orig->usage)) {
+		struct seccomp_filter *freeme = orig;
+		orig = orig->prev;
+		kfree(freeme);
+	}
+}
+
+/**
+ * seccomp_send_sigsys - signals the task to allow in-process syscall emulation
+ * @syscall: syscall number to send to userland
+ * @reason: filter-supplied reason code to send to userland (via si_errno)
+ *
+ * Forces a SIGSYS with a code of SYS_SECCOMP and related sigsys info.
+ */
+static void seccomp_send_sigsys(int syscall, int reason)
+{
+	struct siginfo info;
+	memset(&info, 0, sizeof(info));
+	info.si_signo = SIGSYS;
+	info.si_code = SYS_SECCOMP;
+	info.si_call_addr = (void __user *)KSTK_EIP(current);
+	info.si_errno = reason;
+	info.si_arch = syscall_get_arch(current, task_pt_regs(current));
+	info.si_syscall = syscall;
+	force_sig_info(SIGSYS, &info, current);
+}
+#endif	/* CONFIG_SECCOMP_FILTER */
 
 /*
  * Secure computing mode 1 allows only read/write/exit/sigreturn.
@@ -31,13 +372,15 @@ static int mode1_syscalls_32[] = {
 };
 #endif
 
-void __secure_computing(int this_syscall)
+int __secure_computing(int this_syscall)
 {
 	int mode = current->seccomp.mode;
-	int * syscall;
+	int exit_sig = 0;
+	int *syscall;
+	u32 ret;
 
 	switch (mode) {
-	case 1:
+	case SECCOMP_MODE_STRICT:
 		syscall = mode1_syscalls;
 #ifdef CONFIG_COMPAT
 		if (is_compat_task())
@@ -45,9 +388,54 @@ void __secure_computing(int this_syscall)
 #endif
 		do {
 			if (*syscall == this_syscall)
-				return;
+				return 0;
 		} while (*++syscall);
+		exit_sig = SIGKILL;
+		ret = SECCOMP_RET_KILL;
+		break;
+#ifdef CONFIG_SECCOMP_FILTER
+	case SECCOMP_MODE_FILTER: {
+		int data;
+		ret = seccomp_run_filters(this_syscall);
+		data = ret & SECCOMP_RET_DATA;
+		ret &= SECCOMP_RET_ACTION;
+		switch (ret) {
+		case SECCOMP_RET_ERRNO:
+			/* Set the low-order 16-bits as a errno. */
+			syscall_set_return_value(current, task_pt_regs(current),
+						 -data, 0);
+			goto skip;
+		case SECCOMP_RET_TRAP:
+			/* Show the handler the original registers. */
+			syscall_rollback(current, task_pt_regs(current));
+			/* Let the filter pass back 16 bits of data. */
+			seccomp_send_sigsys(this_syscall, data);
+			goto skip;
+		case SECCOMP_RET_TRACE:
+			/* Skip these calls if there is no tracer. */
+			if (!ptrace_event_enabled(current, PTRACE_EVENT_SECCOMP))
+				goto skip;
+			/* Allow the BPF to provide the event message */
+			ptrace_event(PTRACE_EVENT_SECCOMP, data);
+			/*
+			 * The delivery of a fatal signal during event
+			 * notification may silently skip tracer notification.
+			 * Terminating the task now avoids executing a system
+			 * call that may not be intended.
+			 */
+			if (fatal_signal_pending(current))
+				break;
+			return 0;
+		case SECCOMP_RET_ALLOW:
+			return 0;
+		case SECCOMP_RET_KILL:
+		default:
+			break;
+		}
+		exit_sig = SIGSYS;
 		break;
+	}
+#endif
 	default:
 		BUG();
 	}
@@ -55,8 +443,13 @@ void __secure_computing(int this_syscall)
 #ifdef SECCOMP_DEBUG
 	dump_stack();
 #endif
-	audit_seccomp(this_syscall);
-	do_exit(SIGKILL);
+	audit_seccomp(this_syscall, exit_sig, ret);
+	do_exit(exit_sig);
+#ifdef CONFIG_SECCOMP_FILTER
+skip:
+	audit_seccomp(this_syscall, exit_sig, ret);
+#endif
+	return -1;
 }
 
 long prctl_get_seccomp(void)
@@ -64,25 +457,48 @@ long prctl_get_seccomp(void)
 	return current->seccomp.mode;
 }
 
-long prctl_set_seccomp(unsigned long seccomp_mode)
+/**
+ * prctl_set_seccomp: configures current->seccomp.mode
+ * @seccomp_mode: requested mode to use
+ * @filter: optional struct sock_fprog for use with SECCOMP_MODE_FILTER
+ *
+ * This function may be called repeatedly with a @seccomp_mode of
+ * SECCOMP_MODE_FILTER to install additional filters.  Every filter
+ * successfully installed will be evaluated (in reverse order) for each system
+ * call the task makes.
+ *
+ * Once current->seccomp.mode is non-zero, it may not be changed.
+ *
+ * Returns 0 on success or -EINVAL on failure.
+ */
+long prctl_set_seccomp(unsigned long seccomp_mode, char __user *filter)
 {
-	long ret;
+	long ret = -EINVAL;
 
-	/* can set it only once to be even more secure */
-	ret = -EPERM;
-	if (unlikely(current->seccomp.mode))
+	if (current->seccomp.mode &&
+	    current->seccomp.mode != seccomp_mode)
 		goto out;
 
-	ret = -EINVAL;
-	if (seccomp_mode && seccomp_mode <= NR_SECCOMP_MODES) {
-		current->seccomp.mode = seccomp_mode;
-		set_thread_flag(TIF_SECCOMP);
+	switch (seccomp_mode) {
+	case SECCOMP_MODE_STRICT:
+		ret = 0;
 #ifdef TIF_NOTSC
 		disable_TSC();
 #endif
-		ret = 0;
+		break;
+#ifdef CONFIG_SECCOMP_FILTER
+	case SECCOMP_MODE_FILTER:
+		ret = seccomp_attach_user_filter(filter);
+		if (ret)
+			goto out;
+		break;
+#endif
+	default:
+		goto out;
 	}
 
- out:
+	current->seccomp.mode = seccomp_mode;
+	set_thread_flag(TIF_SECCOMP);
+out:
 	return ret;
 }
diff --git a/kernel/signal.c b/kernel/signal.c
index 17afcaf582d0..1a006b5d9d9d 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -160,7 +160,7 @@ void recalc_sigpending(void)
 
 #define SYNCHRONOUS_MASK \
 	(sigmask(SIGSEGV) | sigmask(SIGBUS) | sigmask(SIGILL) | \
-	 sigmask(SIGTRAP) | sigmask(SIGFPE))
+	 sigmask(SIGTRAP) | sigmask(SIGFPE) | sigmask(SIGSYS))
 
 int next_signal(struct sigpending *pending, sigset_t *mask)
 {
@@ -2706,6 +2706,13 @@ int copy_siginfo_to_user(siginfo_t __user *to, siginfo_t *from)
 		err |= __put_user(from->si_uid, &to->si_uid);
 		err |= __put_user(from->si_ptr, &to->si_ptr);
 		break;
+#ifdef __ARCH_SIGSYS
+	case __SI_SYS:
+		err |= __put_user(from->si_call_addr, &to->si_call_addr);
+		err |= __put_user(from->si_syscall, &to->si_syscall);
+		err |= __put_user(from->si_arch, &to->si_arch);
+		break;
+#endif
 	default: /* this is just in case for now ... */
 		err |= __put_user(from->si_pid, &to->si_pid);
 		err |= __put_user(from->si_uid, &to->si_uid);
diff --git a/kernel/sys.c b/kernel/sys.c
index e7006eb6c1e4..ba0ae8eea6fb 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1908,7 +1908,7 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			error = prctl_get_seccomp();
 			break;
 		case PR_SET_SECCOMP:
-			error = prctl_set_seccomp(arg2);
+			error = prctl_set_seccomp(arg2, (char __user *)arg3);
 			break;
 		case PR_GET_TSC:
 			error = GET_TSC_CTL(arg2);
@@ -1979,6 +1979,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			error = put_user(me->signal->is_child_subreaper,
 					 (int __user *) arg2);
 			break;
+		case PR_SET_NO_NEW_PRIVS:
+			if (arg2 != 1 || arg3 || arg4 || arg5)
+				return -EINVAL;
+
+			current->no_new_privs = 1;
+			break;
+		case PR_GET_NO_NEW_PRIVS:
+			if (arg2 || arg3 || arg4 || arg5)
+				return -EINVAL;
+			return current->no_new_privs ? 1 : 0;
 		default:
 			error = -EINVAL;
 			break;
diff --git a/net/compat.c b/net/compat.c
index e240441a2317..1b96281892de 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -328,14 +328,6 @@ void scm_detach_fds_compat(struct msghdr *kmsg, struct scm_cookie *scm)
 	__scm_destroy(scm);
 }
 
-/*
- * A struct sock_filter is architecture independent.
- */
-struct compat_sock_fprog {
-	u16		len;
-	compat_uptr_t	filter;		/* struct sock_filter * */
-};
-
 static int do_set_attach_filter(struct socket *sock, int level, int optname,
 				char __user *optval, unsigned int optlen)
 {
diff --git a/net/core/filter.c b/net/core/filter.c
index 47a5f055e7f3..a3eddb515d1b 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -38,6 +38,7 @@
 #include <linux/filter.h>
 #include <linux/reciprocal_div.h>
 #include <linux/ratelimit.h>
+#include <linux/seccomp.h>
 
 /* No hurry in this branch
  *
@@ -355,6 +356,11 @@ load_b:
 				A = 0;
 			continue;
 		}
+#ifdef CONFIG_SECCOMP_FILTER
+		case BPF_S_ANC_SECCOMP_LD_W:
+			A = seccomp_bpf_load(fentry->k);
+			continue;
+#endif
 		default:
 			WARN_RATELIMIT(1, "Unknown code:%u jt:%u tf:%u k:%u\n",
 				       fentry->code, fentry->jt,
diff --git a/net/dns_resolver/dns_key.c b/net/dns_resolver/dns_key.c
index 6f70ea935b0b..d9507dd05818 100644
--- a/net/dns_resolver/dns_key.c
+++ b/net/dns_resolver/dns_key.c
@@ -249,9 +249,6 @@ static int __init init_dns_resolver(void)
 	struct key *keyring;
 	int ret;
 
-	printk(KERN_NOTICE "Registering the %s key type\n",
-	       key_type_dns_resolver.name);
-
 	/* create an override credential set with a special thread keyring in
 	 * which DNS requests are cached
 	 *
@@ -301,8 +298,6 @@ static void __exit exit_dns_resolver(void)
 	key_revoke(dns_resolver_cache->thread_keyring);
 	unregister_key_type(&key_type_dns_resolver);
 	put_cred(dns_resolver_cache);
-	printk(KERN_NOTICE "Unregistered %s key type\n",
-	       key_type_dns_resolver.name);
 }
 
 module_init(init_dns_resolver)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 3c87a1c4066f..c53e8f42aa75 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -26,6 +26,7 @@
 #include <linux/cache.h>
 #include <linux/audit.h>
 #include <net/dst.h>
+#include <net/flow.h>
 #include <net/xfrm.h>
 #include <net/ip.h>
 #ifdef CONFIG_XFRM_STATISTICS
diff --git a/samples/Makefile b/samples/Makefile
index 2f75851ec629..5ef08bba96ce 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -1,4 +1,4 @@
 # Makefile for Linux samples code
 
 obj-$(CONFIG_SAMPLES)	+= kobject/ kprobes/ tracepoints/ trace_events/ \
-			   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/
+			   hw_breakpoint/ kfifo/ kdb/ hidraw/ rpmsg/ seccomp/
diff --git a/samples/seccomp/Makefile b/samples/seccomp/Makefile
new file mode 100644
index 000000000000..16aa2d424985
--- /dev/null
+++ b/samples/seccomp/Makefile
@@ -0,0 +1,32 @@
+# kbuild trick to avoid linker error. Can be omitted if a module is built.
+obj- := dummy.o
+
+hostprogs-$(CONFIG_SECCOMP_FILTER) := bpf-fancy dropper bpf-direct
+
+HOSTCFLAGS_bpf-fancy.o += -I$(objtree)/usr/include
+HOSTCFLAGS_bpf-fancy.o += -idirafter $(objtree)/include
+HOSTCFLAGS_bpf-helper.o += -I$(objtree)/usr/include
+HOSTCFLAGS_bpf-helper.o += -idirafter $(objtree)/include
+bpf-fancy-objs := bpf-fancy.o bpf-helper.o
+
+HOSTCFLAGS_dropper.o += -I$(objtree)/usr/include
+HOSTCFLAGS_dropper.o += -idirafter $(objtree)/include
+dropper-objs := dropper.o
+
+HOSTCFLAGS_bpf-direct.o += -I$(objtree)/usr/include
+HOSTCFLAGS_bpf-direct.o += -idirafter $(objtree)/include
+bpf-direct-objs := bpf-direct.o
+
+# Try to match the kernel target.
+ifeq ($(CONFIG_64BIT),)
+HOSTCFLAGS_bpf-direct.o += -m32
+HOSTCFLAGS_dropper.o += -m32
+HOSTCFLAGS_bpf-helper.o += -m32
+HOSTCFLAGS_bpf-fancy.o += -m32
+HOSTLOADLIBES_bpf-direct += -m32
+HOSTLOADLIBES_bpf-fancy += -m32
+HOSTLOADLIBES_dropper += -m32
+endif
+
+# Tell kbuild to always build the programs
+always := $(hostprogs-y)
diff --git a/samples/seccomp/bpf-direct.c b/samples/seccomp/bpf-direct.c
new file mode 100644
index 000000000000..151ec3f52189
--- /dev/null
+++ b/samples/seccomp/bpf-direct.c
@@ -0,0 +1,190 @@
+/*
+ * Seccomp filter example for x86 (32-bit and 64-bit) with BPF macros
+ *
+ * Copyright (c) 2012 The Chromium OS Authors <chromium-os-dev@chromium.org>
+ * Author: Will Drewry <wad@chromium.org>
+ *
+ * The code may be used by anyone for any purpose,
+ * and can serve as a starting point for developing
+ * applications using prctl(PR_SET_SECCOMP, 2, ...).
+ */
+#if defined(__i386__) || defined(__x86_64__)
+#define SUPPORTED_ARCH 1
+#endif
+
+#if defined(SUPPORTED_ARCH)
+#define __USE_GNU 1
+#define _GNU_SOURCE 1
+
+#include <linux/types.h>
+#include <linux/filter.h>
+#include <linux/seccomp.h>
+#include <linux/unistd.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stddef.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+
+#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
+#define syscall_nr (offsetof(struct seccomp_data, nr))
+
+#if defined(__i386__)
+#define REG_RESULT	REG_EAX
+#define REG_SYSCALL	REG_EAX
+#define REG_ARG0	REG_EBX
+#define REG_ARG1	REG_ECX
+#define REG_ARG2	REG_EDX
+#define REG_ARG3	REG_ESI
+#define REG_ARG4	REG_EDI
+#define REG_ARG5	REG_EBP
+#elif defined(__x86_64__)
+#define REG_RESULT	REG_RAX
+#define REG_SYSCALL	REG_RAX
+#define REG_ARG0	REG_RDI
+#define REG_ARG1	REG_RSI
+#define REG_ARG2	REG_RDX
+#define REG_ARG3	REG_R10
+#define REG_ARG4	REG_R8
+#define REG_ARG5	REG_R9
+#endif
+
+#ifndef PR_SET_NO_NEW_PRIVS
+#define PR_SET_NO_NEW_PRIVS 38
+#endif
+
+#ifndef SYS_SECCOMP
+#define SYS_SECCOMP 1
+#endif
+
+static void emulator(int nr, siginfo_t *info, void *void_context)
+{
+	ucontext_t *ctx = (ucontext_t *)(void_context);
+	int syscall;
+	char *buf;
+	ssize_t bytes;
+	size_t len;
+	if (info->si_code != SYS_SECCOMP)
+		return;
+	if (!ctx)
+		return;
+	syscall = ctx->uc_mcontext.gregs[REG_SYSCALL];
+	buf = (char *) ctx->uc_mcontext.gregs[REG_ARG1];
+	len = (size_t) ctx->uc_mcontext.gregs[REG_ARG2];
+
+	if (syscall != __NR_write)
+		return;
+	if (ctx->uc_mcontext.gregs[REG_ARG0] != STDERR_FILENO)
+		return;
+	/* Redirect stderr messages to stdout. Doesn't handle EINTR, etc */
+	ctx->uc_mcontext.gregs[REG_RESULT] = -1;
+	if (write(STDOUT_FILENO, "[ERR] ", 6) > 0) {
+		bytes = write(STDOUT_FILENO, buf, len);
+		ctx->uc_mcontext.gregs[REG_RESULT] = bytes;
+	}
+	return;
+}
+
+static int install_emulator(void)
+{
+	struct sigaction act;
+	sigset_t mask;
+	memset(&act, 0, sizeof(act));
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGSYS);
+
+	act.sa_sigaction = &emulator;
+	act.sa_flags = SA_SIGINFO;
+	if (sigaction(SIGSYS, &act, NULL) < 0) {
+		perror("sigaction");
+		return -1;
+	}
+	if (sigprocmask(SIG_UNBLOCK, &mask, NULL)) {
+		perror("sigprocmask");
+		return -1;
+	}
+	return 0;
+}
+
+static int install_filter(void)
+{
+	struct sock_filter filter[] = {
+		/* Grab the system call number */
+		BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
+		/* Jump table for the allowed syscalls */
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+#ifdef __NR_sigreturn
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+#endif
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2),
+
+		/* Check that read is only using stdin. */
+		BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_arg(0)),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, STDIN_FILENO, 4, 0),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_KILL),
+
+		/* Check that write is only using stdout */
+		BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_arg(0)),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, STDOUT_FILENO, 1, 0),
+		/* Trap attempts to write to stderr */
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, STDERR_FILENO, 1, 2),
+
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_TRAP),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_KILL),
+	};
+	struct sock_fprog prog = {
+		.len = (unsigned short)(sizeof(filter)/sizeof(filter[0])),
+		.filter = filter,
+	};
+
+	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+		perror("prctl(NO_NEW_PRIVS)");
+		return 1;
+	}
+
+
+	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) {
+		perror("prctl");
+		return 1;
+	}
+	return 0;
+}
+
+#define payload(_c) (_c), sizeof((_c))
+int main(int argc, char **argv)
+{
+	char buf[4096];
+	ssize_t bytes = 0;
+	if (install_emulator())
+		return 1;
+	if (install_filter())
+		return 1;
+	syscall(__NR_write, STDOUT_FILENO,
+		payload("OHAI! WHAT IS YOUR NAME? "));
+	bytes = syscall(__NR_read, STDIN_FILENO, buf, sizeof(buf));
+	syscall(__NR_write, STDOUT_FILENO, payload("HELLO, "));
+	syscall(__NR_write, STDOUT_FILENO, buf, bytes);
+	syscall(__NR_write, STDERR_FILENO,
+		payload("Error message going to STDERR\n"));
+	return 0;
+}
+#else	/* SUPPORTED_ARCH */
+/*
+ * This sample is x86-only.  Since kernel samples are compiled with the
+ * host toolchain, a non-x86 host will result in using only the main()
+ * below.
+ */
+int main(void)
+{
+	return 1;
+}
+#endif	/* SUPPORTED_ARCH */
diff --git a/samples/seccomp/bpf-fancy.c b/samples/seccomp/bpf-fancy.c
new file mode 100644
index 000000000000..8eb483aaec46
--- /dev/null
+++ b/samples/seccomp/bpf-fancy.c
@@ -0,0 +1,102 @@
+/*
+ * Seccomp BPF example using a macro-based generator.
+ *
+ * Copyright (c) 2012 The Chromium OS Authors <chromium-os-dev@chromium.org>
+ * Author: Will Drewry <wad@chromium.org>
+ *
+ * The code may be used by anyone for any purpose,
+ * and can serve as a starting point for developing
+ * applications using prctl(PR_ATTACH_SECCOMP_FILTER).
+ */
+
+#include <linux/filter.h>
+#include <linux/seccomp.h>
+#include <linux/unistd.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+
+#include "bpf-helper.h"
+
+#ifndef PR_SET_NO_NEW_PRIVS
+#define PR_SET_NO_NEW_PRIVS 38
+#endif
+
+int main(int argc, char **argv)
+{
+	struct bpf_labels l;
+	static const char msg1[] = "Please type something: ";
+	static const char msg2[] = "You typed: ";
+	char buf[256];
+	struct sock_filter filter[] = {
+		/* TODO: LOAD_SYSCALL_NR(arch) and enforce an arch */
+		LOAD_SYSCALL_NR,
+		SYSCALL(__NR_exit, ALLOW),
+		SYSCALL(__NR_exit_group, ALLOW),
+		SYSCALL(__NR_write, JUMP(&l, write_fd)),
+		SYSCALL(__NR_read, JUMP(&l, read)),
+		DENY,  /* Don't passthrough into a label */
+
+		LABEL(&l, read),
+		ARG(0),
+		JNE(STDIN_FILENO, DENY),
+		ARG(1),
+		JNE((unsigned long)buf, DENY),
+		ARG(2),
+		JGE(sizeof(buf), DENY),
+		ALLOW,
+
+		LABEL(&l, write_fd),
+		ARG(0),
+		JEQ(STDOUT_FILENO, JUMP(&l, write_buf)),
+		JEQ(STDERR_FILENO, JUMP(&l, write_buf)),
+		DENY,
+
+		LABEL(&l, write_buf),
+		ARG(1),
+		JEQ((unsigned long)msg1, JUMP(&l, msg1_len)),
+		JEQ((unsigned long)msg2, JUMP(&l, msg2_len)),
+		JEQ((unsigned long)buf, JUMP(&l, buf_len)),
+		DENY,
+
+		LABEL(&l, msg1_len),
+		ARG(2),
+		JLT(sizeof(msg1), ALLOW),
+		DENY,
+
+		LABEL(&l, msg2_len),
+		ARG(2),
+		JLT(sizeof(msg2), ALLOW),
+		DENY,
+
+		LABEL(&l, buf_len),
+		ARG(2),
+		JLT(sizeof(buf), ALLOW),
+		DENY,
+	};
+	struct sock_fprog prog = {
+		.filter = filter,
+		.len = (unsigned short)(sizeof(filter)/sizeof(filter[0])),
+	};
+	ssize_t bytes;
+	bpf_resolve_jumps(&l, filter, sizeof(filter)/sizeof(*filter));
+
+	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
+		perror("prctl(NO_NEW_PRIVS)");
+		return 1;
+	}
+
+	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) {
+		perror("prctl(SECCOMP)");
+		return 1;
+	}
+	syscall(__NR_write, STDOUT_FILENO, msg1, strlen(msg1));
+	bytes = syscall(__NR_read, STDIN_FILENO, buf, sizeof(buf)-1);
+	bytes = (bytes > 0 ? bytes : 0);
+	syscall(__NR_write, STDERR_FILENO, msg2, strlen(msg2));
+	syscall(__NR_write, STDERR_FILENO, buf, bytes);
+	/* Now get killed */
+	syscall(__NR_write, STDERR_FILENO, msg2, strlen(msg2)+2);
+	return 0;
+}
diff --git a/samples/seccomp/bpf-helper.c b/samples/seccomp/bpf-helper.c
new file mode 100644
index 000000000000..579cfe331886
--- /dev/null
+++ b/samples/seccomp/bpf-helper.c
@@ -0,0 +1,89 @@
+/*
+ * Seccomp BPF helper functions
+ *
+ * Copyright (c) 2012 The Chromium OS Authors <chromium-os-dev@chromium.org>
+ * Author: Will Drewry <wad@chromium.org>
+ *
+ * The code may be used by anyone for any purpose,
+ * and can serve as a starting point for developing
+ * applications using prctl(PR_ATTACH_SECCOMP_FILTER).
+ */
+
+#include <stdio.h>
+#include <string.h>
+
+#include "bpf-helper.h"
+
+int bpf_resolve_jumps(struct bpf_labels *labels,
+		      struct sock_filter *filter, size_t count)
+{
+	struct sock_filter *begin = filter;
+	__u8 insn = count - 1;
+
+	if (count < 1)
+		return -1;
+	/*
+	* Walk it once, backwards, to build the label table and do fixups.
+	* Since backward jumps are disallowed by BPF, this is easy.
+	*/
+	filter += insn;
+	for (; filter >= begin; --insn, --filter) {
+		if (filter->code != (BPF_JMP+BPF_JA))
+			continue;
+		switch ((filter->jt<<8)|filter->jf) {
+		case (JUMP_JT<<8)|JUMP_JF:
+			if (labels->labels[filter->k].location == 0xffffffff) {
+				fprintf(stderr, "Unresolved label: '%s'\n",
+					labels->labels[filter->k].label);
+				return 1;
+			}
+			filter->k = labels->labels[filter->k].location -
+				    (insn + 1);
+			filter->jt = 0;
+			filter->jf = 0;
+			continue;
+		case (LABEL_JT<<8)|LABEL_JF:
+			if (labels->labels[filter->k].location != 0xffffffff) {
+				fprintf(stderr, "Duplicate label use: '%s'\n",
+					labels->labels[filter->k].label);
+				return 1;
+			}
+			labels->labels[filter->k].location = insn;
+			filter->k = 0; /* fall through */
+			filter->jt = 0;
+			filter->jf = 0;
+			continue;
+		}
+	}
+	return 0;
+}
+
+/* Simple lookup table for labels. */
+__u32 seccomp_bpf_label(struct bpf_labels *labels, const char *label)
+{
+	struct __bpf_label *begin = labels->labels, *end;
+	int id;
+	if (labels->count == 0) {
+		begin->label = label;
+		begin->location = 0xffffffff;
+		labels->count++;
+		return 0;
+	}
+	end = begin + labels->count;
+	for (id = 0; begin < end; ++begin, ++id) {
+		if (!strcmp(label, begin->label))
+			return id;
+	}
+	begin->label = label;
+	begin->location = 0xffffffff;
+	labels->count++;
+	return id;
+}
+
+void seccomp_bpf_print(struct sock_filter *filter, size_t count)
+{
+	struct sock_filter *end = filter + count;
+	for ( ; filter < end; ++filter)
+		printf("{ code=%u,jt=%u,jf=%u,k=%u },\n",
+			filter->code, filter->jt, filter->jf, filter->k);
+}
diff --git a/samples/seccomp/bpf-helper.h b/samples/seccomp/bpf-helper.h
new file mode 100644
index 000000000000..643279dd30fb
--- /dev/null
+++ b/samples/seccomp/bpf-helper.h
@@ -0,0 +1,238 @@
+/*
+ * Example wrapper around BPF macros.
+ *
+ * Copyright (c) 2012 The Chromium OS Authors <chromium-os-dev@chromium.org>
+ * Author: Will Drewry <wad@chromium.org>
+ *
+ * The code may be used by anyone for any purpose,
+ * and can serve as a starting point for developing
+ * applications using prctl(PR_SET_SECCOMP, 2, ...).
+ *
+ * No guarantees are provided with respect to the correctness
+ * or functionality of this code.
+ */
+#ifndef __BPF_HELPER_H__
+#define __BPF_HELPER_H__
+
+#include <asm/bitsperlong.h>	/* for __BITS_PER_LONG */
+#include <endian.h>
+#include <linux/filter.h>
+#include <linux/seccomp.h>	/* for seccomp_data */
+#include <linux/types.h>
+#include <linux/unistd.h>
+#include <stddef.h>
+
+#define BPF_LABELS_MAX 256
+struct bpf_labels {
+	int count;
+	struct __bpf_label {
+		const char *label;
+		__u32 location;
+	} labels[BPF_LABELS_MAX];
+};
+
+int bpf_resolve_jumps(struct bpf_labels *labels,
+		      struct sock_filter *filter, size_t count);
+__u32 seccomp_bpf_label(struct bpf_labels *labels, const char *label);
+void seccomp_bpf_print(struct sock_filter *filter, size_t count);
+
+#define JUMP_JT 0xff
+#define JUMP_JF 0xff
+#define LABEL_JT 0xfe
+#define LABEL_JF 0xfe
+
+#define ALLOW \
+	BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW)
+#define DENY \
+	BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_KILL)
+#define JUMP(labels, label) \
+	BPF_JUMP(BPF_JMP+BPF_JA, FIND_LABEL((labels), (label)), \
+		 JUMP_JT, JUMP_JF)
+#define LABEL(labels, label) \
+	BPF_JUMP(BPF_JMP+BPF_JA, FIND_LABEL((labels), (label)), \
+		 LABEL_JT, LABEL_JF)
+#define SYSCALL(nr, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (nr), 0, 1), \
+	jt
+
+/* Lame, but just an example */
+#define FIND_LABEL(labels, label) seccomp_bpf_label((labels), #label)
+
+#define EXPAND(...) __VA_ARGS__
+/* Map all width-sensitive operations */
+#if __BITS_PER_LONG == 32
+
+#define JEQ(x, jt) JEQ32(x, EXPAND(jt))
+#define JNE(x, jt) JNE32(x, EXPAND(jt))
+#define JGT(x, jt) JGT32(x, EXPAND(jt))
+#define JLT(x, jt) JLT32(x, EXPAND(jt))
+#define JGE(x, jt) JGE32(x, EXPAND(jt))
+#define JLE(x, jt) JLE32(x, EXPAND(jt))
+#define JA(x, jt) JA32(x, EXPAND(jt))
+#define ARG(i) ARG_32(i)
+#define LO_ARG(idx) offsetof(struct seccomp_data, args[(idx)])
+
+#elif __BITS_PER_LONG == 64
+
+/* Ensure that we load the logically correct offset. */
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define ENDIAN(_lo, _hi) _lo, _hi
+#define LO_ARG(idx) offsetof(struct seccomp_data, args[(idx)])
+#define HI_ARG(idx) offsetof(struct seccomp_data, args[(idx)]) + sizeof(__u32)
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define ENDIAN(_lo, _hi) _hi, _lo
+#define LO_ARG(idx) offsetof(struct seccomp_data, args[(idx)]) + sizeof(__u32)
+#define HI_ARG(idx) offsetof(struct seccomp_data, args[(idx)])
+#else
+#error "Unknown endianness"
+#endif
+
+union arg64 {
+	struct {
+		__u32 ENDIAN(lo32, hi32);
+	};
+	__u64 u64;
+};
+
+#define JEQ(x, jt) \
+	JEQ64(((union arg64){.u64 = (x)}).lo32, \
+	      ((union arg64){.u64 = (x)}).hi32, \
+	      EXPAND(jt))
+#define JGT(x, jt) \
+	JGT64(((union arg64){.u64 = (x)}).lo32, \
+	      ((union arg64){.u64 = (x)}).hi32, \
+	      EXPAND(jt))
+#define JGE(x, jt) \
+	JGE64(((union arg64){.u64 = (x)}).lo32, \
+	      ((union arg64){.u64 = (x)}).hi32, \
+	      EXPAND(jt))
+#define JNE(x, jt) \
+	JNE64(((union arg64){.u64 = (x)}).lo32, \
+	      ((union arg64){.u64 = (x)}).hi32, \
+	      EXPAND(jt))
+#define JLT(x, jt) \
+	JLT64(((union arg64){.u64 = (x)}).lo32, \
+	      ((union arg64){.u64 = (x)}).hi32, \
+	      EXPAND(jt))
+#define JLE(x, jt) \
+	JLE64(((union arg64){.u64 = (x)}).lo32, \
+	      ((union arg64){.u64 = (x)}).hi32, \
+	      EXPAND(jt))
+
+#define JA(x, jt) \
+	JA64(((union arg64){.u64 = (x)}).lo32, \
+	       ((union arg64){.u64 = (x)}).hi32, \
+	       EXPAND(jt))
+#define ARG(i) ARG_64(i)
+
+#else
+#error __BITS_PER_LONG value unusable.
+#endif
+
+/* Loads the arg into A */
+#define ARG_32(idx) \
+	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, LO_ARG(idx))
+
+/* Loads hi into A and lo in X */
+#define ARG_64(idx) \
+	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, LO_ARG(idx)), \
+	BPF_STMT(BPF_ST, 0), /* lo -> M[0] */ \
+	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, HI_ARG(idx)), \
+	BPF_STMT(BPF_ST, 1) /* hi -> M[1] */
+
+#define JEQ32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (value), 0, 1), \
+	jt
+
+#define JNE32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (value), 1, 0), \
+	jt
+
+/* Checks the lo, then swaps to check the hi. A=lo,X=hi */
+#define JEQ64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (hi), 0, 5), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (lo), 0, 2), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define JNE64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (hi), 5, 0), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (lo), 2, 0), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define JA32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, (value), 0, 1), \
+	jt
+
+#define JA64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, (hi), 3, 0), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, (lo), 0, 2), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define JGE32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGE+BPF_K, (value), 0, 1), \
+	jt
+
+#define JLT32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGE+BPF_K, (value), 1, 0), \
+	jt
+
+/* Shortcut checking if hi > arg.hi. */
+#define JGE64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (hi), 4, 0), \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (hi), 0, 5), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JGE+BPF_K, (lo), 0, 2), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define JLT64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGE+BPF_K, (hi), 0, 4), \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (hi), 0, 5), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (lo), 2, 0), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define JGT32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (value), 0, 1), \
+	jt
+
+#define JLE32(value, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (value), 1, 0), \
+	jt
+
+/* Check hi > args.hi first, then do the GE checking */
+#define JGT64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (hi), 4, 0), \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (hi), 0, 5), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (lo), 0, 2), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define JLE64(lo, hi, jt) \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (hi), 6, 0), \
+	BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, (hi), 0, 3), \
+	BPF_STMT(BPF_LD+BPF_MEM, 0), /* swap in lo */ \
+	BPF_JUMP(BPF_JMP+BPF_JGT+BPF_K, (lo), 2, 0), \
+	BPF_STMT(BPF_LD+BPF_MEM, 1), /* passed: swap hi back in */ \
+	jt, \
+	BPF_STMT(BPF_LD+BPF_MEM, 1) /* failed: swap hi back in */
+
+#define LOAD_SYSCALL_NR \
+	BPF_STMT(BPF_LD+BPF_W+BPF_ABS, \
+		 offsetof(struct seccomp_data, nr))
+
+#endif  /* __BPF_HELPER_H__ */
diff --git a/samples/seccomp/dropper.c b/samples/seccomp/dropper.c
new file mode 100644
index 000000000000..c69c347c7011
--- /dev/null
+++ b/samples/seccomp/dropper.c
@@ -0,0 +1,68 @@
+/*
+ * Naive system call dropper built on seccomp_filter.
+ *
+ * Copyright (c) 2012 The Chromium OS Authors <chromium-os-dev@chromium.org>
+ * Author: Will Drewry <wad@chromium.org>
+ *
+ * The code may be used by anyone for any purpose,
+ * and can serve as a starting point for developing
+ * applications using prctl(PR_SET_SECCOMP, 2, ...).
+ *
+ * When run, returns the specified errno for the specified
+ * system call number against the given architecture.
+ *
+ * Run this one as root as PR_SET_NO_NEW_PRIVS is not called.
+ */
+
+#include <errno.h>
+#include <linux/audit.h>
+#include <linux/filter.h>
+#include <linux/seccomp.h>
+#include <linux/unistd.h>
+#include <stdio.h>
+#include <stddef.h>
+#include <stdlib.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+
+static int install_filter(int nr, int arch, int error)
+{
+	struct sock_filter filter[] = {
+		BPF_STMT(BPF_LD+BPF_W+BPF_ABS,
+			 (offsetof(struct seccomp_data, arch))),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, arch, 0, 3),
+		BPF_STMT(BPF_LD+BPF_W+BPF_ABS,
+			 (offsetof(struct seccomp_data, nr))),
+		BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, nr, 0, 1),
+		BPF_STMT(BPF_RET+BPF_K,
+			 SECCOMP_RET_ERRNO|(error & SECCOMP_RET_DATA)),
+		BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
+	};
+	struct sock_fprog prog = {
+		.len = (unsigned short)(sizeof(filter)/sizeof(filter[0])),
+		.filter = filter,
+	};
+	if (prctl(PR_SET_SECCOMP, 2, &prog)) {
+		perror("prctl");
+		return 1;
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	if (argc < 5) {
+		fprintf(stderr, "Usage:\n"
+			"dropper <syscall_nr> <arch> <errno> <prog> [<args>]\n"
+			"Hint:	AUDIT_ARCH_I386: 0x%X\n"
+			"	AUDIT_ARCH_X86_64: 0x%X\n"
+			"\n", AUDIT_ARCH_I386, AUDIT_ARCH_X86_64);
+		return 1;
+	}
+	if (install_filter(strtol(argv[1], NULL, 0), strtol(argv[2], NULL, 0),
+			   strtol(argv[3], NULL, 0)))
+		return 1;
+	execv(argv[4], &argv[4]);
+	printf("Failed to execv\n");
+	return 255;
+}
diff --git a/security/Kconfig b/security/Kconfig
index ccc61f8006b2..e9c6ac724fef 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -4,73 +4,7 @@
 
 menu "Security options"
 
-config KEYS
-	bool "Enable access key retention support"
-	help
-	  This option provides support for retaining authentication tokens and
-	  access keys in the kernel.
-
-	  It also includes provision of methods by which such keys might be
-	  associated with a process so that network filesystems, encryption
-	  support and the like can find them.
-
-	  Furthermore, a special type of key is available that acts as keyring:
-	  a searchable sequence of keys. Each process is equipped with access
-	  to five standard keyrings: UID-specific, GID-specific, session,
-	  process and thread.
-
-	  If you are unsure as to whether this is required, answer N.
-
-config TRUSTED_KEYS
-	tristate "TRUSTED KEYS"
-	depends on KEYS && TCG_TPM
-	select CRYPTO
-	select CRYPTO_HMAC
-	select CRYPTO_SHA1
-	help
-	  This option provides support for creating, sealing, and unsealing
-	  keys in the kernel. Trusted keys are random number symmetric keys,
-	  generated and RSA-sealed by the TPM. The TPM only unseals the keys,
-	  if the boot PCRs and other criteria match.  Userspace will only ever
-	  see encrypted blobs.
-
-	  If you are unsure as to whether this is required, answer N.
-
-config ENCRYPTED_KEYS
-	tristate "ENCRYPTED KEYS"
-	depends on KEYS
-	select CRYPTO
-	select CRYPTO_HMAC
-	select CRYPTO_AES
-	select CRYPTO_CBC
-	select CRYPTO_SHA256
-	select CRYPTO_RNG
-	help
-	  This option provides support for create/encrypting/decrypting keys
-	  in the kernel.  Encrypted keys are kernel generated random numbers,
-	  which are encrypted/decrypted with a 'master' symmetric key. The
-	  'master' key can be either a trusted-key or user-key type.
-	  Userspace only ever sees/stores encrypted blobs.
-
-	  If you are unsure as to whether this is required, answer N.
-
-config KEYS_DEBUG_PROC_KEYS
-	bool "Enable the /proc/keys file by which keys may be viewed"
-	depends on KEYS
-	help
-	  This option turns on support for the /proc/keys file - through which
-	  can be listed all the keys on the system that are viewable by the
-	  reading process.
-
-	  The only keys included in the list are those that grant View
-	  permission to the reading process whether or not it possesses them.
-	  Note that LSM security checks are still performed, and may further
-	  filter out keys that the current process is not authorised to view.
-
-	  Only key attributes are listed here; key payloads are not included in
-	  the resulting table.
-
-	  If you are unsure as to whether this is required, answer N.
+source security/keys/Kconfig
 
 config SECURITY_DMESG_RESTRICT
 	bool "Restrict unprivileged access to the kernel syslog"
diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index cc3520d39a78..3ae28db5a64f 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -111,7 +111,7 @@ static const char *const aa_audit_type[] = {
 static void audit_pre(struct audit_buffer *ab, void *ca)
 {
 	struct common_audit_data *sa = ca;
-	struct task_struct *tsk = sa->tsk ? sa->tsk : current;
+	struct task_struct *tsk = sa->aad->tsk ? sa->aad->tsk : current;
 
 	if (aa_g_audit_header) {
 		audit_log_format(ab, "apparmor=");
@@ -149,6 +149,12 @@ static void audit_pre(struct audit_buffer *ab, void *ca)
 		audit_log_format(ab, " name=");
 		audit_log_untrustedstring(ab, sa->aad->name);
 	}
+
+	if (sa->aad->tsk) {
+		audit_log_format(ab, " pid=%d comm=", tsk->pid);
+		audit_log_untrustedstring(ab, tsk->comm);
+	}
+
 }
 
 /**
@@ -205,7 +211,8 @@ int aa_audit(int type, struct aa_profile *profile, gfp_t gfp,
 	aa_audit_msg(type, sa, cb);
 
 	if (sa->aad->type == AUDIT_APPARMOR_KILL)
-		(void)send_sig_info(SIGKILL, NULL, sa->tsk ? sa->tsk : current);
+		(void)send_sig_info(SIGKILL, NULL,
+				    sa->aad->tsk ?  sa->aad->tsk : current);
 
 	if (sa->aad->type == AUDIT_APPARMOR_ALLOWED)
 		return complain_error(sa->aad->error);
diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 088dba3bf7dc..887a5e948945 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -65,10 +65,10 @@ static int audit_caps(struct aa_profile *profile, struct task_struct *task,
 	int type = AUDIT_APPARMOR_AUTO;
 	struct common_audit_data sa;
 	struct apparmor_audit_data aad = {0,};
-	COMMON_AUDIT_DATA_INIT(&sa, CAP);
+	sa.type = LSM_AUDIT_DATA_CAP;
 	sa.aad = &aad;
-	sa.tsk = task;
 	sa.u.cap = cap;
+	sa.aad->tsk = task;
 	sa.aad->op = OP_CAPABLE;
 	sa.aad->error = error;
 
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 6327685c101e..b81ea10a17a3 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -394,6 +394,11 @@ int apparmor_bprm_set_creds(struct linux_binprm *bprm)
 			new_profile = find_attach(ns, &ns->base.profiles, name);
 		if (!new_profile)
 			goto cleanup;
+		/*
+		 * NOTE: Domain transitions from unconfined are allowed
+		 * even when no_new_privs is set because this aways results
+		 * in a further reduction of permissions.
+		 */
 		goto apply;
 	}
 
@@ -455,6 +460,16 @@ int apparmor_bprm_set_creds(struct linux_binprm *bprm)
 		/* fail exec */
 		error = -EACCES;
 
+	/*
+	 * Policy has specified a domain transition, if no_new_privs then
+	 * fail the exec.
+	 */
+	if (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS) {
+		aa_put_profile(new_profile);
+		error = -EPERM;
+		goto cleanup;
+	}
+
 	if (!new_profile)
 		goto audit;
 
@@ -609,6 +624,14 @@ int aa_change_hat(const char *hats[], int count, u64 token, bool permtest)
 	const char *target = NULL, *info = NULL;
 	int error = 0;
 
+	/*
+	 * Fail explicitly requested domain transitions if no_new_privs.
+	 * There is no exception for unconfined as change_hat is not
+	 * available.
+	 */
+	if (current->no_new_privs)
+		return -EPERM;
+
 	/* released below */
 	cred = get_current_cred();
 	cxt = cred->security;
@@ -750,6 +773,18 @@ int aa_change_profile(const char *ns_name, const char *hname, bool onexec,
 	cxt = cred->security;
 	profile = aa_cred_profile(cred);
 
+	/*
+	 * Fail explicitly requested domain transitions if no_new_privs
+	 * and not unconfined.
+	 * Domain transitions from unconfined are allowed even when
+	 * no_new_privs is set because this aways results in a reduction
+	 * of permissions.
+	 */
+	if (current->no_new_privs && !unconfined(profile)) {
+		put_cred(cred);
+		return -EPERM;
+	}
+
 	if (ns_name) {
 		/* released below */
 		ns = aa_find_namespace(profile->ns, ns_name);
diff --git a/security/apparmor/file.c b/security/apparmor/file.c
index 2f8fcba9ce4b..cf19d4093ca4 100644
--- a/security/apparmor/file.c
+++ b/security/apparmor/file.c
@@ -108,7 +108,7 @@ int aa_audit_file(struct aa_profile *profile, struct file_perms *perms,
 	int type = AUDIT_APPARMOR_AUTO;
 	struct common_audit_data sa;
 	struct apparmor_audit_data aad = {0,};
-	COMMON_AUDIT_DATA_INIT(&sa, NONE);
+	sa.type = LSM_AUDIT_DATA_NONE;
 	sa.aad = &aad;
 	aad.op = op,
 	aad.fs.request = request;
diff --git a/security/apparmor/include/audit.h b/security/apparmor/include/audit.h
index 3868b1e5d5ba..4b7e18951aea 100644
--- a/security/apparmor/include/audit.h
+++ b/security/apparmor/include/audit.h
@@ -110,6 +110,7 @@ struct apparmor_audit_data {
 	void *profile;
 	const char *name;
 	const char *info;
+	struct task_struct *tsk;
 	union {
 		void *target;
 		struct {
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index c3da93a5150d..cf1071b14232 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -42,7 +42,7 @@ static int aa_audit_ptrace(struct aa_profile *profile,
 {
 	struct common_audit_data sa;
 	struct apparmor_audit_data aad = {0,};
-	COMMON_AUDIT_DATA_INIT(&sa, NONE);
+	sa.type = LSM_AUDIT_DATA_NONE;
 	sa.aad = &aad;
 	aad.op = OP_PTRACE;
 	aad.target = target;
diff --git a/security/apparmor/lib.c b/security/apparmor/lib.c
index e75829ba0ff9..7430298116d6 100644
--- a/security/apparmor/lib.c
+++ b/security/apparmor/lib.c
@@ -66,7 +66,7 @@ void aa_info_message(const char *str)
 	if (audit_enabled) {
 		struct common_audit_data sa;
 		struct apparmor_audit_data aad = {0,};
-		COMMON_AUDIT_DATA_INIT(&sa, NONE);
+		sa.type = LSM_AUDIT_DATA_NONE;
 		sa.aad = &aad;
 		aad.info = str;
 		aa_audit_msg(AUDIT_APPARMOR_STATUS, &sa, NULL);
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index ad05d391974d..032daab449b0 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -373,7 +373,7 @@ static int apparmor_inode_getattr(struct vfsmount *mnt, struct dentry *dentry)
 				      AA_MAY_META_READ);
 }
 
-static int apparmor_dentry_open(struct file *file, const struct cred *cred)
+static int apparmor_file_open(struct file *file, const struct cred *cred)
 {
 	struct aa_file_cxt *fcxt = file->f_security;
 	struct aa_profile *profile;
@@ -589,7 +589,7 @@ static int apparmor_setprocattr(struct task_struct *task, char *name,
 		} else {
 			struct common_audit_data sa;
 			struct apparmor_audit_data aad = {0,};
-			COMMON_AUDIT_DATA_INIT(&sa, NONE);
+			sa.type = LSM_AUDIT_DATA_NONE;
 			sa.aad = &aad;
 			aad.op = OP_SETPROCATTR;
 			aad.info = name;
@@ -640,9 +640,9 @@ static struct security_operations apparmor_ops = {
 	.path_chmod =			apparmor_path_chmod,
 	.path_chown =			apparmor_path_chown,
 	.path_truncate =		apparmor_path_truncate,
-	.dentry_open =			apparmor_dentry_open,
 	.inode_getattr =                apparmor_inode_getattr,
 
+	.file_open =			apparmor_file_open,
 	.file_permission =		apparmor_file_permission,
 	.file_alloc_security =		apparmor_file_alloc_security,
 	.file_free_security =		apparmor_file_free_security,
diff --git a/security/apparmor/path.c b/security/apparmor/path.c
index 2daeea4f9266..e91ffee80162 100644
--- a/security/apparmor/path.c
+++ b/security/apparmor/path.c
@@ -94,6 +94,8 @@ static int d_namespace_path(struct path *path, char *buf, int buflen,
 	 * be returned.
 	 */
 	if (!res || IS_ERR(res)) {
+		if (PTR_ERR(res) == -ENAMETOOLONG)
+			return -ENAMETOOLONG;
 		connected = 0;
 		res = dentry_path_raw(path->dentry, buf, buflen);
 		if (IS_ERR(res)) {
diff --git a/security/apparmor/policy.c b/security/apparmor/policy.c
index f1f7506a464d..cf5fd220309b 100644
--- a/security/apparmor/policy.c
+++ b/security/apparmor/policy.c
@@ -903,6 +903,10 @@ struct aa_profile *aa_lookup_profile(struct aa_namespace *ns, const char *hname)
 	profile = aa_get_profile(__lookup_profile(&ns->base, hname));
 	read_unlock(&ns->lock);
 
+	/* the unconfined profile is not in the regular profile list */
+	if (!profile && strcmp(hname, "unconfined") == 0)
+		profile = aa_get_profile(ns->unconfined);
+
 	/* refcount released by caller */
 	return profile;
 }
@@ -965,7 +969,7 @@ static int audit_policy(int op, gfp_t gfp, const char *name, const char *info,
 {
 	struct common_audit_data sa;
 	struct apparmor_audit_data aad = {0,};
-	COMMON_AUDIT_DATA_INIT(&sa, NONE);
+	sa.type = LSM_AUDIT_DATA_NONE;
 	sa.aad = &aad;
 	aad.op = op;
 	aad.name = name;
diff --git a/security/apparmor/policy_unpack.c b/security/apparmor/policy_unpack.c
index deab7c7e8dc0..329b1fd30749 100644
--- a/security/apparmor/policy_unpack.c
+++ b/security/apparmor/policy_unpack.c
@@ -95,7 +95,7 @@ static int audit_iface(struct aa_profile *new, const char *name,
 	struct aa_profile *profile = __aa_current_profile();
 	struct common_audit_data sa;
 	struct apparmor_audit_data aad = {0,};
-	COMMON_AUDIT_DATA_INIT(&sa, NONE);
+	sa.type = LSM_AUDIT_DATA_NONE;
 	sa.aad = &aad;
 	if (e)
 		aad.iface.pos = e->pos - e->start;
diff --git a/security/apparmor/resource.c b/security/apparmor/resource.c
index 2fe8613efe33..e1f3d7ef2c54 100644
--- a/security/apparmor/resource.c
+++ b/security/apparmor/resource.c
@@ -52,7 +52,7 @@ static int audit_resource(struct aa_profile *profile, unsigned int resource,
 	struct common_audit_data sa;
 	struct apparmor_audit_data aad = {0,};
 
-	COMMON_AUDIT_DATA_INIT(&sa, NONE);
+	sa.type = LSM_AUDIT_DATA_NONE;
 	sa.aad = &aad;
 	aad.op = OP_SETRLIMIT,
 	aad.rlim.rlim = resource;
diff --git a/security/capability.c b/security/capability.c
index 5bb21b1c448c..fca889676c5e 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -348,7 +348,7 @@ static int cap_file_receive(struct file *file)
 	return 0;
 }
 
-static int cap_dentry_open(struct file *file, const struct cred *cred)
+static int cap_file_open(struct file *file, const struct cred *cred)
 {
 	return 0;
 }
@@ -956,7 +956,7 @@ void __init security_fixup_ops(struct security_operations *ops)
 	set_to_cap_if_null(ops, file_set_fowner);
 	set_to_cap_if_null(ops, file_send_sigiotask);
 	set_to_cap_if_null(ops, file_receive);
-	set_to_cap_if_null(ops, dentry_open);
+	set_to_cap_if_null(ops, file_open);
 	set_to_cap_if_null(ops, task_create);
 	set_to_cap_if_null(ops, task_free);
 	set_to_cap_if_null(ops, cred_alloc_blank);
diff --git a/security/commoncap.c b/security/commoncap.c
index 71a166a05975..f80d11609391 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -512,14 +512,17 @@ skip:
 
 
 	/* Don't let someone trace a set[ug]id/setpcap binary with the revised
-	 * credentials unless they have the appropriate permit
+	 * credentials unless they have the appropriate permit.
+	 *
+	 * In addition, if NO_NEW_PRIVS, then ensure we get no new privs.
 	 */
 	if ((new->euid != old->uid ||
 	     new->egid != old->gid ||
 	     !cap_issubset(new->cap_permitted, old->cap_permitted)) &&
 	    bprm->unsafe & ~LSM_UNSAFE_PTRACE_CAP) {
 		/* downgrade; they get no more than they had, and maybe less */
-		if (!capable(CAP_SETUID)) {
+		if (!capable(CAP_SETUID) ||
+		    (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)) {
 			new->euid = new->uid;
 			new->egid = new->gid;
 		}
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 1eff5cb001e5..b17be79b9cf2 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -194,7 +194,9 @@ int ima_bprm_check(struct linux_binprm *bprm)
 {
 	int rc;
 
-	rc = process_measurement(bprm->file, bprm->filename,
+	rc = process_measurement(bprm->file,
+				 (strcmp(bprm->filename, bprm->interp) == 0) ?
+				 bprm->filename : bprm->interp,
 				 MAY_EXEC, BPRM_CHECK);
 	return 0;
 }
diff --git a/security/keys/Kconfig b/security/keys/Kconfig
new file mode 100644
index 000000000000..a90d6d300dbd
--- /dev/null
+++ b/security/keys/Kconfig
@@ -0,0 +1,71 @@
+#
+# Key management configuration
+#
+
+config KEYS
+	bool "Enable access key retention support"
+	help
+	  This option provides support for retaining authentication tokens and
+	  access keys in the kernel.
+
+	  It also includes provision of methods by which such keys might be
+	  associated with a process so that network filesystems, encryption
+	  support and the like can find them.
+
+	  Furthermore, a special type of key is available that acts as keyring:
+	  a searchable sequence of keys. Each process is equipped with access
+	  to five standard keyrings: UID-specific, GID-specific, session,
+	  process and thread.
+
+	  If you are unsure as to whether this is required, answer N.
+
+config TRUSTED_KEYS
+	tristate "TRUSTED KEYS"
+	depends on KEYS && TCG_TPM
+	select CRYPTO
+	select CRYPTO_HMAC
+	select CRYPTO_SHA1
+	help
+	  This option provides support for creating, sealing, and unsealing
+	  keys in the kernel. Trusted keys are random number symmetric keys,
+	  generated and RSA-sealed by the TPM. The TPM only unseals the keys,
+	  if the boot PCRs and other criteria match.  Userspace will only ever
+	  see encrypted blobs.
+
+	  If you are unsure as to whether this is required, answer N.
+
+config ENCRYPTED_KEYS
+	tristate "ENCRYPTED KEYS"
+	depends on KEYS
+	select CRYPTO
+	select CRYPTO_HMAC
+	select CRYPTO_AES
+	select CRYPTO_CBC
+	select CRYPTO_SHA256
+	select CRYPTO_RNG
+	help
+	  This option provides support for create/encrypting/decrypting keys
+	  in the kernel.  Encrypted keys are kernel generated random numbers,
+	  which are encrypted/decrypted with a 'master' symmetric key. The
+	  'master' key can be either a trusted-key or user-key type.
+	  Userspace only ever sees/stores encrypted blobs.
+
+	  If you are unsure as to whether this is required, answer N.
+
+config KEYS_DEBUG_PROC_KEYS
+	bool "Enable the /proc/keys file by which keys may be viewed"
+	depends on KEYS
+	help
+	  This option turns on support for the /proc/keys file - through which
+	  can be listed all the keys on the system that are viewable by the
+	  reading process.
+
+	  The only keys included in the list are those that grant View
+	  permission to the reading process whether or not it possesses them.
+	  Note that LSM security checks are still performed, and may further
+	  filter out keys that the current process is not authorised to view.
+
+	  Only key attributes are listed here; key payloads are not included in
+	  the resulting table.
+
+	  If you are unsure as to whether this is required, answer N.
diff --git a/security/keys/Makefile b/security/keys/Makefile
index a56f1ffdc64d..504aaa008388 100644
--- a/security/keys/Makefile
+++ b/security/keys/Makefile
@@ -2,6 +2,9 @@
 # Makefile for key management
 #
 
+#
+# Core
+#
 obj-y := \
 	gc.o \
 	key.o \
@@ -12,9 +15,12 @@ obj-y := \
 	request_key.o \
 	request_key_auth.o \
 	user_defined.o
-
-obj-$(CONFIG_TRUSTED_KEYS) += trusted.o
-obj-$(CONFIG_ENCRYPTED_KEYS) += encrypted-keys/
 obj-$(CONFIG_KEYS_COMPAT) += compat.o
 obj-$(CONFIG_PROC_FS) += proc.o
 obj-$(CONFIG_SYSCTL) += sysctl.o
+
+#
+# Key types
+#
+obj-$(CONFIG_TRUSTED_KEYS) += trusted.o
+obj-$(CONFIG_ENCRYPTED_KEYS) += encrypted-keys/
diff --git a/security/keys/compat.c b/security/keys/compat.c
index 4c48e13448f8..fab4f8dda6c6 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -135,6 +135,9 @@ asmlinkage long compat_sys_keyctl(u32 option,
 		return compat_keyctl_instantiate_key_iov(
 			arg2, compat_ptr(arg3), arg4, arg5);
 
+	case KEYCTL_INVALIDATE:
+		return keyctl_invalidate_key(arg2);
+
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/security/keys/gc.c b/security/keys/gc.c
index a42b45531aac..61ab7c82ebb1 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -72,6 +72,15 @@ void key_schedule_gc(time_t gc_at)
 }
 
 /*
+ * Schedule a dead links collection run.
+ */
+void key_schedule_gc_links(void)
+{
+	set_bit(KEY_GC_KEY_EXPIRED, &key_gc_flags);
+	queue_work(system_nrt_wq, &key_gc_work);
+}
+
+/*
  * Some key's cleanup time was met after it expired, so we need to get the
  * reaper to go through a cycle finding expired keys.
  */
@@ -79,8 +88,7 @@ static void key_gc_timer_func(unsigned long data)
 {
 	kenter("");
 	key_gc_next_run = LONG_MAX;
-	set_bit(KEY_GC_KEY_EXPIRED, &key_gc_flags);
-	queue_work(system_nrt_wq, &key_gc_work);
+	key_schedule_gc_links();
 }
 
 /*
@@ -131,12 +139,12 @@ void key_gc_keytype(struct key_type *ktype)
 static void key_gc_keyring(struct key *keyring, time_t limit)
 {
 	struct keyring_list *klist;
-	struct key *key;
 	int loop;
 
 	kenter("%x", key_serial(keyring));
 
-	if (test_bit(KEY_FLAG_REVOKED, &keyring->flags))
+	if (keyring->flags & ((1 << KEY_FLAG_INVALIDATED) |
+			      (1 << KEY_FLAG_REVOKED)))
 		goto dont_gc;
 
 	/* scan the keyring looking for dead keys */
@@ -148,9 +156,8 @@ static void key_gc_keyring(struct key *keyring, time_t limit)
 	loop = klist->nkeys;
 	smp_rmb();
 	for (loop--; loop >= 0; loop--) {
-		key = klist->keys[loop];
-		if (test_bit(KEY_FLAG_DEAD, &key->flags) ||
-		    (key->expiry > 0 && key->expiry <= limit))
+		struct key *key = rcu_dereference(klist->keys[loop]);
+		if (key_is_dead(key, limit))
 			goto do_gc;
 	}
 
@@ -168,38 +175,45 @@ do_gc:
 }
 
 /*
- * Garbage collect an unreferenced, detached key
+ * Garbage collect a list of unreferenced, detached keys
  */
-static noinline void key_gc_unused_key(struct key *key)
+static noinline void key_gc_unused_keys(struct list_head *keys)
 {
-	key_check(key);
-
-	security_key_free(key);
-
-	/* deal with the user's key tracking and quota */
-	if (test_bit(KEY_FLAG_IN_QUOTA, &key->flags)) {
-		spin_lock(&key->user->lock);
-		key->user->qnkeys--;
-		key->user->qnbytes -= key->quotalen;
-		spin_unlock(&key->user->lock);
-	}
+	while (!list_empty(keys)) {
+		struct key *key =
+			list_entry(keys->next, struct key, graveyard_link);
+		list_del(&key->graveyard_link);
+
+		kdebug("- %u", key->serial);
+		key_check(key);
+
+		security_key_free(key);
+
+		/* deal with the user's key tracking and quota */
+		if (test_bit(KEY_FLAG_IN_QUOTA, &key->flags)) {
+			spin_lock(&key->user->lock);
+			key->user->qnkeys--;
+			key->user->qnbytes -= key->quotalen;
+			spin_unlock(&key->user->lock);
+		}
 
-	atomic_dec(&key->user->nkeys);
-	if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags))
-		atomic_dec(&key->user->nikeys);
+		atomic_dec(&key->user->nkeys);
+		if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags))
+			atomic_dec(&key->user->nikeys);
 
-	key_user_put(key->user);
+		key_user_put(key->user);
 
-	/* now throw away the key memory */
-	if (key->type->destroy)
-		key->type->destroy(key);
+		/* now throw away the key memory */
+		if (key->type->destroy)
+			key->type->destroy(key);
 
-	kfree(key->description);
+		kfree(key->description);
 
 #ifdef KEY_DEBUGGING
-	key->magic = KEY_DEBUG_MAGIC_X;
+		key->magic = KEY_DEBUG_MAGIC_X;
 #endif
-	kmem_cache_free(key_jar, key);
+		kmem_cache_free(key_jar, key);
+	}
 }
 
 /*
@@ -211,6 +225,7 @@ static noinline void key_gc_unused_key(struct key *key)
  */
 static void key_garbage_collector(struct work_struct *work)
 {
+	static LIST_HEAD(graveyard);
 	static u8 gc_state;		/* Internal persistent state */
 #define KEY_GC_REAP_AGAIN	0x01	/* - Need another cycle */
 #define KEY_GC_REAPING_LINKS	0x02	/* - We need to reap links */
@@ -316,15 +331,22 @@ maybe_resched:
 		key_schedule_gc(new_timer);
 	}
 
-	if (unlikely(gc_state & KEY_GC_REAPING_DEAD_2)) {
-		/* Make sure everyone revalidates their keys if we marked a
-		 * bunch as being dead and make sure all keyring ex-payloads
-		 * are destroyed.
+	if (unlikely(gc_state & KEY_GC_REAPING_DEAD_2) ||
+	    !list_empty(&graveyard)) {
+		/* Make sure that all pending keyring payload destructions are
+		 * fulfilled and that people aren't now looking at dead or
+		 * dying keys that they don't have a reference upon or a link
+		 * to.
 		 */
-		kdebug("dead sync");
+		kdebug("gc sync");
 		synchronize_rcu();
 	}
 
+	if (!list_empty(&graveyard)) {
+		kdebug("gc keys");
+		key_gc_unused_keys(&graveyard);
+	}
+
 	if (unlikely(gc_state & (KEY_GC_REAPING_DEAD_1 |
 				 KEY_GC_REAPING_DEAD_2))) {
 		if (!(gc_state & KEY_GC_FOUND_DEAD_KEY)) {
@@ -359,7 +381,7 @@ found_unreferenced_key:
 	rb_erase(&key->serial_node, &key_serial_tree);
 	spin_unlock(&key_serial_lock);
 
-	key_gc_unused_key(key);
+	list_add_tail(&key->graveyard_link, &graveyard);
 	gc_state |= KEY_GC_REAP_AGAIN;
 	goto maybe_resched;
 
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 65647f825584..f711b094ed41 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -152,7 +152,8 @@ extern long join_session_keyring(const char *name);
 extern struct work_struct key_gc_work;
 extern unsigned key_gc_delay;
 extern void keyring_gc(struct key *keyring, time_t limit);
-extern void key_schedule_gc(time_t expiry_at);
+extern void key_schedule_gc(time_t gc_at);
+extern void key_schedule_gc_links(void);
 extern void key_gc_keytype(struct key_type *ktype);
 
 extern int key_task_permission(const key_ref_t key_ref,
@@ -197,6 +198,17 @@ extern struct key *request_key_auth_new(struct key *target,
 extern struct key *key_get_instantiation_authkey(key_serial_t target_id);
 
 /*
+ * Determine whether a key is dead.
+ */
+static inline bool key_is_dead(struct key *key, time_t limit)
+{
+	return
+		key->flags & ((1 << KEY_FLAG_DEAD) |
+			      (1 << KEY_FLAG_INVALIDATED)) ||
+		(key->expiry > 0 && key->expiry <= limit);
+}
+
+/*
  * keyctl() functions
  */
 extern long keyctl_get_keyring_ID(key_serial_t, int);
@@ -225,6 +237,7 @@ extern long keyctl_reject_key(key_serial_t, unsigned, unsigned, key_serial_t);
 extern long keyctl_instantiate_key_iov(key_serial_t,
 				       const struct iovec __user *,
 				       unsigned, key_serial_t);
+extern long keyctl_invalidate_key(key_serial_t);
 
 extern long keyctl_instantiate_key_common(key_serial_t,
 					  const struct iovec __user *,
diff --git a/security/keys/key.c b/security/keys/key.c
index 06783cffb3af..c9bf66ac36e0 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -955,6 +955,28 @@ void key_revoke(struct key *key)
 EXPORT_SYMBOL(key_revoke);
 
 /**
+ * key_invalidate - Invalidate a key.
+ * @key: The key to be invalidated.
+ *
+ * Mark a key as being invalidated and have it cleaned up immediately.  The key
+ * is ignored by all searches and other operations from this point.
+ */
+void key_invalidate(struct key *key)
+{
+	kenter("%d", key_serial(key));
+
+	key_check(key);
+
+	if (!test_bit(KEY_FLAG_INVALIDATED, &key->flags)) {
+		down_write_nested(&key->sem, 1);
+		if (!test_and_set_bit(KEY_FLAG_INVALIDATED, &key->flags))
+			key_schedule_gc_links();
+		up_write(&key->sem);
+	}
+}
+EXPORT_SYMBOL(key_invalidate);
+
+/**
  * register_key_type - Register a type of key.
  * @ktype: The new key type.
  *
@@ -980,6 +1002,8 @@ int register_key_type(struct key_type *ktype)
 
 	/* store the type */
 	list_add(&ktype->link, &key_types_list);
+
+	pr_notice("Key type %s registered\n", ktype->name);
 	ret = 0;
 
 out:
@@ -1002,6 +1026,7 @@ void unregister_key_type(struct key_type *ktype)
 	list_del_init(&ktype->link);
 	downgrade_write(&key_types_sem);
 	key_gc_keytype(ktype);
+	pr_notice("Key type %s unregistered\n", ktype->name);
 	up_read(&key_types_sem);
 }
 EXPORT_SYMBOL(unregister_key_type);
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index fb767c6cd99f..ddb3e05bc5fc 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -375,6 +375,37 @@ error:
 }
 
 /*
+ * Invalidate a key.
+ *
+ * The key must be grant the caller Invalidate permission for this to work.
+ * The key and any links to the key will be automatically garbage collected
+ * immediately.
+ *
+ * If successful, 0 is returned.
+ */
+long keyctl_invalidate_key(key_serial_t id)
+{
+	key_ref_t key_ref;
+	long ret;
+
+	kenter("%d", id);
+
+	key_ref = lookup_user_key(id, 0, KEY_SEARCH);
+	if (IS_ERR(key_ref)) {
+		ret = PTR_ERR(key_ref);
+		goto error;
+	}
+
+	key_invalidate(key_ref_to_ptr(key_ref));
+	ret = 0;
+
+	key_ref_put(key_ref);
+error:
+	kleave(" = %ld", ret);
+	return ret;
+}
+
+/*
  * Clear the specified keyring, creating an empty process keyring if one of the
  * special keyring IDs is used.
  *
@@ -1622,6 +1653,9 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			(unsigned) arg4,
 			(key_serial_t) arg5);
 
+	case KEYCTL_INVALIDATE:
+		return keyctl_invalidate_key((key_serial_t) arg2);
+
 	default:
 		return -EOPNOTSUPP;
 	}
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index d605f75292e4..7445875f6818 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -25,6 +25,15 @@
 		(keyring)->payload.subscriptions,			\
 		rwsem_is_locked((struct rw_semaphore *)&(keyring)->sem)))
 
+#define rcu_deref_link_locked(klist, index, keyring)			\
+	(rcu_dereference_protected(					\
+		(klist)->keys[index],					\
+		rwsem_is_locked((struct rw_semaphore *)&(keyring)->sem)))
+
+#define MAX_KEYRING_LINKS						\
+	min_t(size_t, USHRT_MAX - 1,					\
+	      ((PAGE_SIZE - sizeof(struct keyring_list)) / sizeof(struct key *)))
+
 #define KEY_LINK_FIXQUOTA 1UL
 
 /*
@@ -138,6 +147,11 @@ static int keyring_match(const struct key *keyring, const void *description)
 /*
  * Clean up a keyring when it is destroyed.  Unpublish its name if it had one
  * and dispose of its data.
+ *
+ * The garbage collector detects the final key_put(), removes the keyring from
+ * the serial number tree and then does RCU synchronisation before coming here,
+ * so we shouldn't need to worry about code poking around here with the RCU
+ * readlock held by this time.
  */
 static void keyring_destroy(struct key *keyring)
 {
@@ -154,11 +168,10 @@ static void keyring_destroy(struct key *keyring)
 		write_unlock(&keyring_name_lock);
 	}
 
-	klist = rcu_dereference_check(keyring->payload.subscriptions,
-				      atomic_read(&keyring->usage) == 0);
+	klist = rcu_access_pointer(keyring->payload.subscriptions);
 	if (klist) {
 		for (loop = klist->nkeys - 1; loop >= 0; loop--)
-			key_put(klist->keys[loop]);
+			key_put(rcu_access_pointer(klist->keys[loop]));
 		kfree(klist);
 	}
 }
@@ -214,7 +227,8 @@ static long keyring_read(const struct key *keyring,
 			ret = -EFAULT;
 
 			for (loop = 0; loop < klist->nkeys; loop++) {
-				key = klist->keys[loop];
+				key = rcu_deref_link_locked(klist, loop,
+							    keyring);
 
 				tmp = sizeof(key_serial_t);
 				if (tmp > buflen)
@@ -309,6 +323,8 @@ key_ref_t keyring_search_aux(key_ref_t keyring_ref,
 			     bool no_state_check)
 {
 	struct {
+		/* Need a separate keylist pointer for RCU purposes */
+		struct key *keyring;
 		struct keyring_list *keylist;
 		int kix;
 	} stack[KEYRING_SEARCH_MAX_DEPTH];
@@ -366,13 +382,17 @@ key_ref_t keyring_search_aux(key_ref_t keyring_ref,
 	/* otherwise, the top keyring must not be revoked, expired, or
 	 * negatively instantiated if we are to search it */
 	key_ref = ERR_PTR(-EAGAIN);
-	if (kflags & ((1 << KEY_FLAG_REVOKED) | (1 << KEY_FLAG_NEGATIVE)) ||
+	if (kflags & ((1 << KEY_FLAG_INVALIDATED) |
+		      (1 << KEY_FLAG_REVOKED) |
+		      (1 << KEY_FLAG_NEGATIVE)) ||
 	    (keyring->expiry && now.tv_sec >= keyring->expiry))
 		goto error_2;
 
 	/* start processing a new keyring */
 descend:
-	if (test_bit(KEY_FLAG_REVOKED, &keyring->flags))
+	kflags = keyring->flags;
+	if (kflags & ((1 << KEY_FLAG_INVALIDATED) |
+		      (1 << KEY_FLAG_REVOKED)))
 		goto not_this_keyring;
 
 	keylist = rcu_dereference(keyring->payload.subscriptions);
@@ -383,16 +403,17 @@ descend:
 	nkeys = keylist->nkeys;
 	smp_rmb();
 	for (kix = 0; kix < nkeys; kix++) {
-		key = keylist->keys[kix];
+		key = rcu_dereference(keylist->keys[kix]);
 		kflags = key->flags;
 
 		/* ignore keys not of this type */
 		if (key->type != type)
 			continue;
 
-		/* skip revoked keys and expired keys */
+		/* skip invalidated, revoked and expired keys */
 		if (!no_state_check) {
-			if (kflags & (1 << KEY_FLAG_REVOKED))
+			if (kflags & ((1 << KEY_FLAG_INVALIDATED) |
+				      (1 << KEY_FLAG_REVOKED)))
 				continue;
 
 			if (key->expiry && now.tv_sec >= key->expiry)
@@ -426,7 +447,7 @@ ascend:
 	nkeys = keylist->nkeys;
 	smp_rmb();
 	for (; kix < nkeys; kix++) {
-		key = keylist->keys[kix];
+		key = rcu_dereference(keylist->keys[kix]);
 		if (key->type != &key_type_keyring)
 			continue;
 
@@ -441,6 +462,7 @@ ascend:
 			continue;
 
 		/* stack the current position */
+		stack[sp].keyring = keyring;
 		stack[sp].keylist = keylist;
 		stack[sp].kix = kix;
 		sp++;
@@ -456,6 +478,7 @@ not_this_keyring:
 	if (sp > 0) {
 		/* resume the processing of a keyring higher up in the tree */
 		sp--;
+		keyring = stack[sp].keyring;
 		keylist = stack[sp].keylist;
 		kix = stack[sp].kix + 1;
 		goto ascend;
@@ -467,6 +490,10 @@ not_this_keyring:
 	/* we found a viable match */
 found:
 	atomic_inc(&key->usage);
+	key->last_used_at = now.tv_sec;
+	keyring->last_used_at = now.tv_sec;
+	while (sp > 0)
+		stack[--sp].keyring->last_used_at = now.tv_sec;
 	key_check(key);
 	key_ref = make_key_ref(key, possessed);
 error_2:
@@ -531,14 +558,14 @@ key_ref_t __keyring_search_one(key_ref_t keyring_ref,
 		nkeys = klist->nkeys;
 		smp_rmb();
 		for (loop = 0; loop < nkeys ; loop++) {
-			key = klist->keys[loop];
-
+			key = rcu_dereference(klist->keys[loop]);
 			if (key->type == ktype &&
 			    (!key->type->match ||
 			     key->type->match(key, description)) &&
 			    key_permission(make_key_ref(key, possessed),
 					   perm) == 0 &&
-			    !test_bit(KEY_FLAG_REVOKED, &key->flags)
+			    !(key->flags & ((1 << KEY_FLAG_INVALIDATED) |
+					    (1 << KEY_FLAG_REVOKED)))
 			    )
 				goto found;
 		}
@@ -549,6 +576,8 @@ key_ref_t __keyring_search_one(key_ref_t keyring_ref,
 
 found:
 	atomic_inc(&key->usage);
+	keyring->last_used_at = key->last_used_at =
+		current_kernel_time().tv_sec;
 	rcu_read_unlock();
 	return make_key_ref(key, possessed);
 }
@@ -602,6 +631,7 @@ struct key *find_keyring_by_name(const char *name, bool skip_perm_check)
 			 * (ie. it has a zero usage count) */
 			if (!atomic_inc_not_zero(&keyring->usage))
 				continue;
+			keyring->last_used_at = current_kernel_time().tv_sec;
 			goto out;
 		}
 	}
@@ -654,7 +684,7 @@ ascend:
 	nkeys = keylist->nkeys;
 	smp_rmb();
 	for (; kix < nkeys; kix++) {
-		key = keylist->keys[kix];
+		key = rcu_dereference(keylist->keys[kix]);
 
 		if (key == A)
 			goto cycle_detected;
@@ -711,7 +741,7 @@ static void keyring_unlink_rcu_disposal(struct rcu_head *rcu)
 		container_of(rcu, struct keyring_list, rcu);
 
 	if (klist->delkey != USHRT_MAX)
-		key_put(klist->keys[klist->delkey]);
+		key_put(rcu_access_pointer(klist->keys[klist->delkey]));
 	kfree(klist);
 }
 
@@ -725,8 +755,9 @@ int __key_link_begin(struct key *keyring, const struct key_type *type,
 	struct keyring_list *klist, *nklist;
 	unsigned long prealloc;
 	unsigned max;
+	time_t lowest_lru;
 	size_t size;
-	int loop, ret;
+	int loop, lru, ret;
 
 	kenter("%d,%s,%s,", key_serial(keyring), type->name, description);
 
@@ -747,31 +778,39 @@ int __key_link_begin(struct key *keyring, const struct key_type *type,
 	klist = rcu_dereference_locked_keyring(keyring);
 
 	/* see if there's a matching key we can displace */
+	lru = -1;
 	if (klist && klist->nkeys > 0) {
+		lowest_lru = TIME_T_MAX;
 		for (loop = klist->nkeys - 1; loop >= 0; loop--) {
-			if (klist->keys[loop]->type == type &&
-			    strcmp(klist->keys[loop]->description,
-				   description) == 0
-			    ) {
-				/* found a match - we'll replace this one with
-				 * the new key */
-				size = sizeof(struct key *) * klist->maxkeys;
-				size += sizeof(*klist);
-				BUG_ON(size > PAGE_SIZE);
-
-				ret = -ENOMEM;
-				nklist = kmemdup(klist, size, GFP_KERNEL);
-				if (!nklist)
-					goto error_sem;
-
-				/* note replacement slot */
-				klist->delkey = nklist->delkey = loop;
-				prealloc = (unsigned long)nklist;
+			struct key *key = rcu_deref_link_locked(klist, loop,
+								keyring);
+			if (key->type == type &&
+			    strcmp(key->description, description) == 0) {
+				/* Found a match - we'll replace the link with
+				 * one to the new key.  We record the slot
+				 * position.
+				 */
+				klist->delkey = loop;
+				prealloc = 0;
 				goto done;
 			}
+			if (key->last_used_at < lowest_lru) {
+				lowest_lru = key->last_used_at;
+				lru = loop;
+			}
 		}
 	}
 
+	/* If the keyring is full then do an LRU discard */
+	if (klist &&
+	    klist->nkeys == klist->maxkeys &&
+	    klist->maxkeys >= MAX_KEYRING_LINKS) {
+		kdebug("LRU discard %d\n", lru);
+		klist->delkey = lru;
+		prealloc = 0;
+		goto done;
+	}
+
 	/* check that we aren't going to overrun the user's quota */
 	ret = key_payload_reserve(keyring,
 				  keyring->datalen + KEYQUOTA_LINK_BYTES);
@@ -780,20 +819,19 @@ int __key_link_begin(struct key *keyring, const struct key_type *type,
 
 	if (klist && klist->nkeys < klist->maxkeys) {
 		/* there's sufficient slack space to append directly */
-		nklist = NULL;
+		klist->delkey = klist->nkeys;
 		prealloc = KEY_LINK_FIXQUOTA;
 	} else {
 		/* grow the key list */
 		max = 4;
-		if (klist)
+		if (klist) {
 			max += klist->maxkeys;
+			if (max > MAX_KEYRING_LINKS)
+				max = MAX_KEYRING_LINKS;
+			BUG_ON(max <= klist->maxkeys);
+		}
 
-		ret = -ENFILE;
-		if (max > USHRT_MAX - 1)
-			goto error_quota;
 		size = sizeof(*klist) + sizeof(struct key *) * max;
-		if (size > PAGE_SIZE)
-			goto error_quota;
 
 		ret = -ENOMEM;
 		nklist = kmalloc(size, GFP_KERNEL);
@@ -813,10 +851,10 @@ int __key_link_begin(struct key *keyring, const struct key_type *type,
 		}
 
 		/* add the key into the new space */
-		nklist->keys[nklist->delkey] = NULL;
+		RCU_INIT_POINTER(nklist->keys[nklist->delkey], NULL);
+		prealloc = (unsigned long)nklist | KEY_LINK_FIXQUOTA;
 	}
 
-	prealloc = (unsigned long)nklist | KEY_LINK_FIXQUOTA;
 done:
 	*_prealloc = prealloc;
 	kleave(" = 0");
@@ -862,6 +900,7 @@ void __key_link(struct key *keyring, struct key *key,
 		unsigned long *_prealloc)
 {
 	struct keyring_list *klist, *nklist;
+	struct key *discard;
 
 	nklist = (struct keyring_list *)(*_prealloc & ~KEY_LINK_FIXQUOTA);
 	*_prealloc = 0;
@@ -871,14 +910,16 @@ void __key_link(struct key *keyring, struct key *key,
 	klist = rcu_dereference_locked_keyring(keyring);
 
 	atomic_inc(&key->usage);
+	keyring->last_used_at = key->last_used_at =
+		current_kernel_time().tv_sec;
 
 	/* there's a matching key we can displace or an empty slot in a newly
 	 * allocated list we can fill */
 	if (nklist) {
-		kdebug("replace %hu/%hu/%hu",
+		kdebug("reissue %hu/%hu/%hu",
 		       nklist->delkey, nklist->nkeys, nklist->maxkeys);
 
-		nklist->keys[nklist->delkey] = key;
+		RCU_INIT_POINTER(nklist->keys[nklist->delkey], key);
 
 		rcu_assign_pointer(keyring->payload.subscriptions, nklist);
 
@@ -889,9 +930,23 @@ void __key_link(struct key *keyring, struct key *key,
 			       klist->delkey, klist->nkeys, klist->maxkeys);
 			call_rcu(&klist->rcu, keyring_unlink_rcu_disposal);
 		}
+	} else if (klist->delkey < klist->nkeys) {
+		kdebug("replace %hu/%hu/%hu",
+		       klist->delkey, klist->nkeys, klist->maxkeys);
+
+		discard = rcu_dereference_protected(
+			klist->keys[klist->delkey],
+			rwsem_is_locked(&keyring->sem));
+		rcu_assign_pointer(klist->keys[klist->delkey], key);
+		/* The garbage collector will take care of RCU
+		 * synchronisation */
+		key_put(discard);
 	} else {
 		/* there's sufficient slack space to append directly */
-		klist->keys[klist->nkeys] = key;
+		kdebug("append %hu/%hu/%hu",
+		       klist->delkey, klist->nkeys, klist->maxkeys);
+
+		RCU_INIT_POINTER(klist->keys[klist->delkey], key);
 		smp_wmb();
 		klist->nkeys++;
 	}
@@ -998,7 +1053,7 @@ int key_unlink(struct key *keyring, struct key *key)
 	if (klist) {
 		/* search the keyring for the key */
 		for (loop = 0; loop < klist->nkeys; loop++)
-			if (klist->keys[loop] == key)
+			if (rcu_access_pointer(klist->keys[loop]) == key)
 				goto key_is_present;
 	}
 
@@ -1061,7 +1116,7 @@ static void keyring_clear_rcu_disposal(struct rcu_head *rcu)
 	klist = container_of(rcu, struct keyring_list, rcu);
 
 	for (loop = klist->nkeys - 1; loop >= 0; loop--)
-		key_put(klist->keys[loop]);
+		key_put(rcu_access_pointer(klist->keys[loop]));
 
 	kfree(klist);
 }
@@ -1128,15 +1183,6 @@ static void keyring_revoke(struct key *keyring)
 }
 
 /*
- * Determine whether a key is dead.
- */
-static bool key_is_dead(struct key *key, time_t limit)
-{
-	return test_bit(KEY_FLAG_DEAD, &key->flags) ||
-		(key->expiry > 0 && key->expiry <= limit);
-}
-
-/*
  * Collect garbage from the contents of a keyring, replacing the old list with
  * a new one with the pointers all shuffled down.
  *
@@ -1161,7 +1207,8 @@ void keyring_gc(struct key *keyring, time_t limit)
 	/* work out how many subscriptions we're keeping */
 	keep = 0;
 	for (loop = klist->nkeys - 1; loop >= 0; loop--)
-		if (!key_is_dead(klist->keys[loop], limit))
+		if (!key_is_dead(rcu_deref_link_locked(klist, loop, keyring),
+				 limit))
 			keep++;
 
 	if (keep == klist->nkeys)
@@ -1182,11 +1229,11 @@ void keyring_gc(struct key *keyring, time_t limit)
 	 */
 	keep = 0;
 	for (loop = klist->nkeys - 1; loop >= 0; loop--) {
-		key = klist->keys[loop];
+		key = rcu_deref_link_locked(klist, loop, keyring);
 		if (!key_is_dead(key, limit)) {
 			if (keep >= max)
 				goto discard_new;
-			new->keys[keep++] = key_get(key);
+			RCU_INIT_POINTER(new->keys[keep++], key_get(key));
 		}
 	}
 	new->nkeys = keep;
diff --git a/security/keys/permission.c b/security/keys/permission.c
index c35b5229e3cd..57d96363d7f1 100644
--- a/security/keys/permission.c
+++ b/security/keys/permission.c
@@ -87,32 +87,29 @@ EXPORT_SYMBOL(key_task_permission);
  * key_validate - Validate a key.
  * @key: The key to be validated.
  *
- * Check that a key is valid, returning 0 if the key is okay, -EKEYREVOKED if
- * the key's type has been removed or if the key has been revoked or
- * -EKEYEXPIRED if the key has expired.
+ * Check that a key is valid, returning 0 if the key is okay, -ENOKEY if the
+ * key is invalidated, -EKEYREVOKED if the key's type has been removed or if
+ * the key has been revoked or -EKEYEXPIRED if the key has expired.
  */
-int key_validate(struct key *key)
+int key_validate(const struct key *key)
 {
-	struct timespec now;
-	int ret = 0;
-
-	if (key) {
-		/* check it's still accessible */
-		ret = -EKEYREVOKED;
-		if (test_bit(KEY_FLAG_REVOKED, &key->flags) ||
-		    test_bit(KEY_FLAG_DEAD, &key->flags))
-			goto error;
-
-		/* check it hasn't expired */
-		ret = 0;
-		if (key->expiry) {
-			now = current_kernel_time();
-			if (now.tv_sec >= key->expiry)
-				ret = -EKEYEXPIRED;
-		}
+	unsigned long flags = key->flags;
+
+	if (flags & (1 << KEY_FLAG_INVALIDATED))
+		return -ENOKEY;
+
+	/* check it's still accessible */
+	if (flags & ((1 << KEY_FLAG_REVOKED) |
+		     (1 << KEY_FLAG_DEAD)))
+		return -EKEYREVOKED;
+
+	/* check it hasn't expired */
+	if (key->expiry) {
+		struct timespec now = current_kernel_time();
+		if (now.tv_sec >= key->expiry)
+			return -EKEYEXPIRED;
 	}
 
-error:
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL(key_validate);
diff --git a/security/keys/proc.c b/security/keys/proc.c
index 49bbc97943ad..30d1ddfd9cef 100644
--- a/security/keys/proc.c
+++ b/security/keys/proc.c
@@ -242,7 +242,7 @@ static int proc_keys_show(struct seq_file *m, void *v)
 #define showflag(KEY, LETTER, FLAG) \
 	(test_bit(FLAG,	&(KEY)->flags) ? LETTER : '-')
 
-	seq_printf(m, "%08x %c%c%c%c%c%c %5d %4s %08x %5d %5d %-9.9s ",
+	seq_printf(m, "%08x %c%c%c%c%c%c%c %5d %4s %08x %5d %5d %-9.9s ",
 		   key->serial,
 		   showflag(key, 'I', KEY_FLAG_INSTANTIATED),
 		   showflag(key, 'R', KEY_FLAG_REVOKED),
@@ -250,6 +250,7 @@ static int proc_keys_show(struct seq_file *m, void *v)
 		   showflag(key, 'Q', KEY_FLAG_IN_QUOTA),
 		   showflag(key, 'U', KEY_FLAG_USER_CONSTRUCT),
 		   showflag(key, 'N', KEY_FLAG_NEGATIVE),
+		   showflag(key, 'i', KEY_FLAG_INVALIDATED),
 		   atomic_read(&key->usage),
 		   xbuf,
 		   key->perm,
diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c
index be7ecb2018dd..e137fcd7042c 100644
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -732,6 +732,8 @@ try_again:
 	if (ret < 0)
 		goto invalid_key;
 
+	key->last_used_at = current_kernel_time().tv_sec;
+
 error:
 	put_cred(cred);
 	return key_ref;
diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 90c129b0102f..8d8d97dbb389 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -213,12 +213,15 @@ static void dump_common_audit_data(struct audit_buffer *ab,
 {
 	struct task_struct *tsk = current;
 
-	if (a->tsk)
-		tsk = a->tsk;
-	if (tsk && tsk->pid) {
-		audit_log_format(ab, " pid=%d comm=", tsk->pid);
-		audit_log_untrustedstring(ab, tsk->comm);
-	}
+	/*
+	 * To keep stack sizes in check force programers to notice if they
+	 * start making this union too large!  See struct lsm_network_audit
+	 * as an example of how to deal with large data.
+	 */
+	BUILD_BUG_ON(sizeof(a->u) > sizeof(void *)*2);
+
+	audit_log_format(ab, " pid=%d comm=", tsk->pid);
+	audit_log_untrustedstring(ab, tsk->comm);
 
 	switch (a->type) {
 	case LSM_AUDIT_DATA_NONE:
diff --git a/security/security.c b/security/security.c
index bf619ffc9a4d..5497a57fba01 100644
--- a/security/security.c
+++ b/security/security.c
@@ -701,11 +701,11 @@ int security_file_receive(struct file *file)
 	return security_ops->file_receive(file);
 }
 
-int security_dentry_open(struct file *file, const struct cred *cred)
+int security_file_open(struct file *file, const struct cred *cred)
 {
 	int ret;
 
-	ret = security_ops->dentry_open(file, cred);
+	ret = security_ops->file_open(file, cred);
 	if (ret)
 		return ret;
 
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 8ee42b2a5f19..68d82daed257 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -65,14 +65,8 @@ struct avc_cache {
 };
 
 struct avc_callback_node {
-	int (*callback) (u32 event, u32 ssid, u32 tsid,
-			 u16 tclass, u32 perms,
-			 u32 *out_retained);
+	int (*callback) (u32 event);
 	u32 events;
-	u32 ssid;
-	u32 tsid;
-	u16 tclass;
-	u32 perms;
 	struct avc_callback_node *next;
 };
 
@@ -436,9 +430,9 @@ static void avc_audit_pre_callback(struct audit_buffer *ab, void *a)
 {
 	struct common_audit_data *ad = a;
 	audit_log_format(ab, "avc:  %s ",
-			 ad->selinux_audit_data->slad->denied ? "denied" : "granted");
-	avc_dump_av(ab, ad->selinux_audit_data->slad->tclass,
-			ad->selinux_audit_data->slad->audited);
+			 ad->selinux_audit_data->denied ? "denied" : "granted");
+	avc_dump_av(ab, ad->selinux_audit_data->tclass,
+			ad->selinux_audit_data->audited);
 	audit_log_format(ab, " for ");
 }
 
@@ -452,25 +446,23 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
 {
 	struct common_audit_data *ad = a;
 	audit_log_format(ab, " ");
-	avc_dump_query(ab, ad->selinux_audit_data->slad->ssid,
-			   ad->selinux_audit_data->slad->tsid,
-			   ad->selinux_audit_data->slad->tclass);
+	avc_dump_query(ab, ad->selinux_audit_data->ssid,
+			   ad->selinux_audit_data->tsid,
+			   ad->selinux_audit_data->tclass);
 }
 
 /* This is the slow part of avc audit with big stack footprint */
-static noinline int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
+noinline int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
 		u32 requested, u32 audited, u32 denied,
 		struct common_audit_data *a,
 		unsigned flags)
 {
 	struct common_audit_data stack_data;
-	struct selinux_audit_data sad = {0,};
-	struct selinux_late_audit_data slad;
+	struct selinux_audit_data sad;
 
 	if (!a) {
 		a = &stack_data;
-		COMMON_AUDIT_DATA_INIT(a, NONE);
-		a->selinux_audit_data = &sad;
+		a->type = LSM_AUDIT_DATA_NONE;
 	}
 
 	/*
@@ -484,104 +476,34 @@ static noinline int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
 	    (flags & MAY_NOT_BLOCK))
 		return -ECHILD;
 
-	slad.tclass = tclass;
-	slad.requested = requested;
-	slad.ssid = ssid;
-	slad.tsid = tsid;
-	slad.audited = audited;
-	slad.denied = denied;
+	sad.tclass = tclass;
+	sad.requested = requested;
+	sad.ssid = ssid;
+	sad.tsid = tsid;
+	sad.audited = audited;
+	sad.denied = denied;
+
+	a->selinux_audit_data = &sad;
 
-	a->selinux_audit_data->slad = &slad;
 	common_lsm_audit(a, avc_audit_pre_callback, avc_audit_post_callback);
 	return 0;
 }
 
 /**
- * avc_audit - Audit the granting or denial of permissions.
- * @ssid: source security identifier
- * @tsid: target security identifier
- * @tclass: target security class
- * @requested: requested permissions
- * @avd: access vector decisions
- * @result: result from avc_has_perm_noaudit
- * @a:  auxiliary audit data
- * @flags: VFS walk flags
- *
- * Audit the granting or denial of permissions in accordance
- * with the policy.  This function is typically called by
- * avc_has_perm() after a permission check, but can also be
- * called directly by callers who use avc_has_perm_noaudit()
- * in order to separate the permission check from the auditing.
- * For example, this separation is useful when the permission check must
- * be performed under a lock, to allow the lock to be released
- * before calling the auditing code.
- */
-inline int avc_audit(u32 ssid, u32 tsid,
-	       u16 tclass, u32 requested,
-	       struct av_decision *avd, int result, struct common_audit_data *a,
-	       unsigned flags)
-{
-	u32 denied, audited;
-	denied = requested & ~avd->allowed;
-	if (unlikely(denied)) {
-		audited = denied & avd->auditdeny;
-		/*
-		 * a->selinux_audit_data->auditdeny is TRICKY!  Setting a bit in
-		 * this field means that ANY denials should NOT be audited if
-		 * the policy contains an explicit dontaudit rule for that
-		 * permission.  Take notice that this is unrelated to the
-		 * actual permissions that were denied.  As an example lets
-		 * assume:
-		 *
-		 * denied == READ
-		 * avd.auditdeny & ACCESS == 0 (not set means explicit rule)
-		 * selinux_audit_data->auditdeny & ACCESS == 1
-		 *
-		 * We will NOT audit the denial even though the denied
-		 * permission was READ and the auditdeny checks were for
-		 * ACCESS
-		 */
-		if (a &&
-		    a->selinux_audit_data->auditdeny &&
-		    !(a->selinux_audit_data->auditdeny & avd->auditdeny))
-			audited = 0;
-	} else if (result)
-		audited = denied = requested;
-	else
-		audited = requested & avd->auditallow;
-	if (likely(!audited))
-		return 0;
-
-	return slow_avc_audit(ssid, tsid, tclass,
-		requested, audited, denied,
-		a, flags);
-}
-
-/**
  * avc_add_callback - Register a callback for security events.
  * @callback: callback function
  * @events: security events
- * @ssid: source security identifier or %SECSID_WILD
- * @tsid: target security identifier or %SECSID_WILD
- * @tclass: target security class
- * @perms: permissions
  *
- * Register a callback function for events in the set @events
- * related to the SID pair (@ssid, @tsid) 
- * and the permissions @perms, interpreting
- * @perms based on @tclass.  Returns %0 on success or
- * -%ENOMEM if insufficient memory exists to add the callback.
+ * Register a callback function for events in the set @events.
+ * Returns %0 on success or -%ENOMEM if insufficient memory
+ * exists to add the callback.
  */
-int avc_add_callback(int (*callback)(u32 event, u32 ssid, u32 tsid,
-				     u16 tclass, u32 perms,
-				     u32 *out_retained),
-		     u32 events, u32 ssid, u32 tsid,
-		     u16 tclass, u32 perms)
+int __init avc_add_callback(int (*callback)(u32 event), u32 events)
 {
 	struct avc_callback_node *c;
 	int rc = 0;
 
-	c = kmalloc(sizeof(*c), GFP_ATOMIC);
+	c = kmalloc(sizeof(*c), GFP_KERNEL);
 	if (!c) {
 		rc = -ENOMEM;
 		goto out;
@@ -589,9 +511,6 @@ int avc_add_callback(int (*callback)(u32 event, u32 ssid, u32 tsid,
 
 	c->callback = callback;
 	c->events = events;
-	c->ssid = ssid;
-	c->tsid = tsid;
-	c->perms = perms;
 	c->next = avc_callbacks;
 	avc_callbacks = c;
 out:
@@ -731,8 +650,7 @@ int avc_ss_reset(u32 seqno)
 
 	for (c = avc_callbacks; c; c = c->next) {
 		if (c->events & AVC_CALLBACK_RESET) {
-			tmprc = c->callback(AVC_CALLBACK_RESET,
-					    0, 0, 0, 0, NULL);
+			tmprc = c->callback(AVC_CALLBACK_RESET);
 			/* save the first error encountered for the return
 			   value and continue processing the callbacks */
 			if (!rc)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index d85b793c9321..fa2341b68331 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -1420,16 +1420,13 @@ static int cred_has_capability(const struct cred *cred,
 			       int cap, int audit)
 {
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct av_decision avd;
 	u16 sclass;
 	u32 sid = cred_sid(cred);
 	u32 av = CAP_TO_MASK(cap);
 	int rc;
 
-	COMMON_AUDIT_DATA_INIT(&ad, CAP);
-	ad.selinux_audit_data = &sad;
-	ad.tsk = current;
+	ad.type = LSM_AUDIT_DATA_CAP;
 	ad.u.cap = cap;
 
 	switch (CAP_TO_INDEX(cap)) {
@@ -1488,20 +1485,6 @@ static int inode_has_perm(const struct cred *cred,
 	return avc_has_perm_flags(sid, isec->sid, isec->sclass, perms, adp, flags);
 }
 
-static int inode_has_perm_noadp(const struct cred *cred,
-				struct inode *inode,
-				u32 perms,
-				unsigned flags)
-{
-	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
-
-	COMMON_AUDIT_DATA_INIT(&ad, INODE);
-	ad.u.inode = inode;
-	ad.selinux_audit_data = &sad;
-	return inode_has_perm(cred, inode, perms, &ad, flags);
-}
-
 /* Same as inode_has_perm, but pass explicit audit data containing
    the dentry to help the auditing code to more easily generate the
    pathname if needed. */
@@ -1511,11 +1494,9 @@ static inline int dentry_has_perm(const struct cred *cred,
 {
 	struct inode *inode = dentry->d_inode;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = dentry;
-	ad.selinux_audit_data = &sad;
 	return inode_has_perm(cred, inode, av, &ad, 0);
 }
 
@@ -1528,11 +1509,9 @@ static inline int path_has_perm(const struct cred *cred,
 {
 	struct inode *inode = path->dentry->d_inode;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 
-	COMMON_AUDIT_DATA_INIT(&ad, PATH);
+	ad.type = LSM_AUDIT_DATA_PATH;
 	ad.u.path = *path;
-	ad.selinux_audit_data = &sad;
 	return inode_has_perm(cred, inode, av, &ad, 0);
 }
 
@@ -1551,13 +1530,11 @@ static int file_has_perm(const struct cred *cred,
 	struct file_security_struct *fsec = file->f_security;
 	struct inode *inode = file->f_path.dentry->d_inode;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = cred_sid(cred);
 	int rc;
 
-	COMMON_AUDIT_DATA_INIT(&ad, PATH);
+	ad.type = LSM_AUDIT_DATA_PATH;
 	ad.u.path = file->f_path;
-	ad.selinux_audit_data = &sad;
 
 	if (sid != fsec->sid) {
 		rc = avc_has_perm(sid, fsec->sid,
@@ -1587,7 +1564,6 @@ static int may_create(struct inode *dir,
 	struct superblock_security_struct *sbsec;
 	u32 sid, newsid;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	int rc;
 
 	dsec = dir->i_security;
@@ -1596,9 +1572,8 @@ static int may_create(struct inode *dir,
 	sid = tsec->sid;
 	newsid = tsec->create_sid;
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = dentry;
-	ad.selinux_audit_data = &sad;
 
 	rc = avc_has_perm(sid, dsec->sid, SECCLASS_DIR,
 			  DIR__ADD_NAME | DIR__SEARCH,
@@ -1643,7 +1618,6 @@ static int may_link(struct inode *dir,
 {
 	struct inode_security_struct *dsec, *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 	u32 av;
 	int rc;
@@ -1651,9 +1625,8 @@ static int may_link(struct inode *dir,
 	dsec = dir->i_security;
 	isec = dentry->d_inode->i_security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = dentry;
-	ad.selinux_audit_data = &sad;
 
 	av = DIR__SEARCH;
 	av |= (kind ? DIR__REMOVE_NAME : DIR__ADD_NAME);
@@ -1688,7 +1661,6 @@ static inline int may_rename(struct inode *old_dir,
 {
 	struct inode_security_struct *old_dsec, *new_dsec, *old_isec, *new_isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 	u32 av;
 	int old_is_dir, new_is_dir;
@@ -1699,8 +1671,7 @@ static inline int may_rename(struct inode *old_dir,
 	old_is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
 	new_dsec = new_dir->i_security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 
 	ad.u.dentry = old_dentry;
 	rc = avc_has_perm(sid, old_dsec->sid, SECCLASS_DIR,
@@ -1986,7 +1957,6 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 	struct task_security_struct *new_tsec;
 	struct inode_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct inode *inode = bprm->file->f_path.dentry->d_inode;
 	int rc;
 
@@ -2016,6 +1986,13 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 		new_tsec->sid = old_tsec->exec_sid;
 		/* Reset exec SID on execve. */
 		new_tsec->exec_sid = 0;
+
+		/*
+		 * Minimize confusion: if no_new_privs and a transition is
+		 * explicitly requested, then fail the exec.
+		 */
+		if (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS)
+			return -EPERM;
 	} else {
 		/* Check for a default transition on this program. */
 		rc = security_transition_sid(old_tsec->sid, isec->sid,
@@ -2025,11 +2002,11 @@ static int selinux_bprm_set_creds(struct linux_binprm *bprm)
 			return rc;
 	}
 
-	COMMON_AUDIT_DATA_INIT(&ad, PATH);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_PATH;
 	ad.u.path = bprm->file->f_path;
 
-	if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)
+	if ((bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID) ||
+	    (bprm->unsafe & LSM_UNSAFE_NO_NEW_PRIVS))
 		new_tsec->sid = old_tsec->sid;
 
 	if (new_tsec->sid == old_tsec->sid) {
@@ -2115,8 +2092,6 @@ static int selinux_bprm_secureexec(struct linux_binprm *bprm)
 static inline void flush_unauthorized_files(const struct cred *cred,
 					    struct files_struct *files)
 {
-	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct file *file, *devnull = NULL;
 	struct tty_struct *tty;
 	struct fdtable *fdt;
@@ -2128,21 +2103,17 @@ static inline void flush_unauthorized_files(const struct cred *cred,
 		spin_lock(&tty_files_lock);
 		if (!list_empty(&tty->tty_files)) {
 			struct tty_file_private *file_priv;
-			struct inode *inode;
 
 			/* Revalidate access to controlling tty.
-			   Use inode_has_perm on the tty inode directly rather
+			   Use path_has_perm on the tty path directly rather
 			   than using file_has_perm, as this particular open
 			   file may belong to another process and we are only
 			   interested in the inode-based check here. */
 			file_priv = list_first_entry(&tty->tty_files,
 						struct tty_file_private, list);
 			file = file_priv->file;
-			inode = file->f_path.dentry->d_inode;
-			if (inode_has_perm_noadp(cred, inode,
-					   FILE__READ | FILE__WRITE, 0)) {
+			if (path_has_perm(cred, &file->f_path, FILE__READ | FILE__WRITE))
 				drop_tty = 1;
-			}
 		}
 		spin_unlock(&tty_files_lock);
 		tty_kref_put(tty);
@@ -2152,10 +2123,6 @@ static inline void flush_unauthorized_files(const struct cred *cred,
 		no_tty();
 
 	/* Revalidate access to inherited open files. */
-
-	COMMON_AUDIT_DATA_INIT(&ad, INODE);
-	ad.selinux_audit_data = &sad;
-
 	spin_lock(&files->file_lock);
 	for (;;) {
 		unsigned long set, i;
@@ -2492,7 +2459,6 @@ static int selinux_sb_kern_mount(struct super_block *sb, int flags, void *data)
 {
 	const struct cred *cred = current_cred();
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	int rc;
 
 	rc = superblock_doinit(sb, data);
@@ -2503,8 +2469,7 @@ static int selinux_sb_kern_mount(struct super_block *sb, int flags, void *data)
 	if (flags & MS_KERNMOUNT)
 		return 0;
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = sb->s_root;
 	return superblock_has_perm(cred, sb, FILESYSTEM__MOUNT, &ad);
 }
@@ -2513,10 +2478,8 @@ static int selinux_sb_statfs(struct dentry *dentry)
 {
 	const struct cred *cred = current_cred();
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = dentry->d_sb->s_root;
 	return superblock_has_perm(cred, dentry->d_sb, FILESYSTEM__GETATTR, &ad);
 }
@@ -2676,14 +2639,35 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct nameidata *na
 	return dentry_has_perm(cred, dentry, FILE__READ);
 }
 
+static noinline int audit_inode_permission(struct inode *inode,
+					   u32 perms, u32 audited, u32 denied,
+					   unsigned flags)
+{
+	struct common_audit_data ad;
+	struct inode_security_struct *isec = inode->i_security;
+	int rc;
+
+	ad.type = LSM_AUDIT_DATA_INODE;
+	ad.u.inode = inode;
+
+	rc = slow_avc_audit(current_sid(), isec->sid, isec->sclass, perms,
+			    audited, denied, &ad, flags);
+	if (rc)
+		return rc;
+	return 0;
+}
+
 static int selinux_inode_permission(struct inode *inode, int mask)
 {
 	const struct cred *cred = current_cred();
-	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 perms;
 	bool from_access;
 	unsigned flags = mask & MAY_NOT_BLOCK;
+	struct inode_security_struct *isec;
+	u32 sid;
+	struct av_decision avd;
+	int rc, rc2;
+	u32 audited, denied;
 
 	from_access = mask & MAY_ACCESS;
 	mask &= (MAY_READ|MAY_WRITE|MAY_EXEC|MAY_APPEND);
@@ -2692,22 +2676,34 @@ static int selinux_inode_permission(struct inode *inode, int mask)
 	if (!mask)
 		return 0;
 
-	COMMON_AUDIT_DATA_INIT(&ad, INODE);
-	ad.selinux_audit_data = &sad;
-	ad.u.inode = inode;
+	validate_creds(cred);
 
-	if (from_access)
-		ad.selinux_audit_data->auditdeny |= FILE__AUDIT_ACCESS;
+	if (unlikely(IS_PRIVATE(inode)))
+		return 0;
 
 	perms = file_mask_to_av(inode->i_mode, mask);
 
-	return inode_has_perm(cred, inode, perms, &ad, flags);
+	sid = cred_sid(cred);
+	isec = inode->i_security;
+
+	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
+	audited = avc_audit_required(perms, &avd, rc,
+				     from_access ? FILE__AUDIT_ACCESS : 0,
+				     &denied);
+	if (likely(!audited))
+		return rc;
+
+	rc2 = audit_inode_permission(inode, perms, audited, denied, flags);
+	if (rc2)
+		return rc2;
+	return rc;
 }
 
 static int selinux_inode_setattr(struct dentry *dentry, struct iattr *iattr)
 {
 	const struct cred *cred = current_cred();
 	unsigned int ia_valid = iattr->ia_valid;
+	__u32 av = FILE__WRITE;
 
 	/* ATTR_FORCE is just used for ATTR_KILL_S[UG]ID. */
 	if (ia_valid & ATTR_FORCE) {
@@ -2721,7 +2717,10 @@ static int selinux_inode_setattr(struct dentry *dentry, struct iattr *iattr)
 			ATTR_ATIME_SET | ATTR_MTIME_SET | ATTR_TIMES_SET))
 		return dentry_has_perm(cred, dentry, FILE__SETATTR);
 
-	return dentry_has_perm(cred, dentry, FILE__WRITE);
+	if (ia_valid & ATTR_SIZE)
+		av |= FILE__OPEN;
+
+	return dentry_has_perm(cred, dentry, av);
 }
 
 static int selinux_inode_getattr(struct vfsmount *mnt, struct dentry *dentry)
@@ -2763,7 +2762,6 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	struct inode_security_struct *isec = inode->i_security;
 	struct superblock_security_struct *sbsec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 newsid, sid = current_sid();
 	int rc = 0;
 
@@ -2777,8 +2775,7 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 	if (!inode_owner_or_capable(inode))
 		return -EPERM;
 
-	COMMON_AUDIT_DATA_INIT(&ad, DENTRY);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_DENTRY;
 	ad.u.dentry = dentry;
 
 	rc = avc_has_perm(sid, isec->sid, isec->sclass,
@@ -2788,8 +2785,25 @@ static int selinux_inode_setxattr(struct dentry *dentry, const char *name,
 
 	rc = security_context_to_sid(value, size, &newsid);
 	if (rc == -EINVAL) {
-		if (!capable(CAP_MAC_ADMIN))
+		if (!capable(CAP_MAC_ADMIN)) {
+			struct audit_buffer *ab;
+			size_t audit_size;
+			const char *str;
+
+			/* We strip a nul only if it is at the end, otherwise the
+			 * context contains a nul and we should audit that */
+			str = value;
+			if (str[size - 1] == '\0')
+				audit_size = size - 1;
+			else
+				audit_size = size;
+			ab = audit_log_start(current->audit_context, GFP_ATOMIC, AUDIT_SELINUX_ERR);
+			audit_log_format(ab, "op=setxattr invalid_context=");
+			audit_log_n_untrustedstring(ab, value, audit_size);
+			audit_log_end(ab);
+
 			return rc;
+		}
 		rc = security_context_to_sid_force(value, size, &newsid);
 	}
 	if (rc)
@@ -2969,7 +2983,7 @@ static int selinux_file_permission(struct file *file, int mask)
 
 	if (sid == fsec->sid && fsec->isid == isec->sid &&
 	    fsec->pseqno == avc_policy_seqno())
-		/* No change since dentry_open check. */
+		/* No change since file_open check. */
 		return 0;
 
 	return selinux_revalidate_file_permission(file, mask);
@@ -3228,15 +3242,13 @@ static int selinux_file_receive(struct file *file)
 	return file_has_perm(cred, file, file_to_av(file));
 }
 
-static int selinux_dentry_open(struct file *file, const struct cred *cred)
+static int selinux_file_open(struct file *file, const struct cred *cred)
 {
 	struct file_security_struct *fsec;
-	struct inode *inode;
 	struct inode_security_struct *isec;
 
-	inode = file->f_path.dentry->d_inode;
 	fsec = file->f_security;
-	isec = inode->i_security;
+	isec = file->f_path.dentry->d_inode->i_security;
 	/*
 	 * Save inode label and policy sequence number
 	 * at open-time so that selinux_file_permission
@@ -3254,7 +3266,7 @@ static int selinux_dentry_open(struct file *file, const struct cred *cred)
 	 * new inode label or new policy.
 	 * This check is not redundant - do not remove.
 	 */
-	return inode_has_perm_noadp(cred, inode, open_file_to_av(file), 0);
+	return path_has_perm(cred, &file->f_path, open_file_to_av(file));
 }
 
 /* task security operations */
@@ -3373,12 +3385,10 @@ static int selinux_kernel_module_request(char *kmod_name)
 {
 	u32 sid;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 
 	sid = task_sid(current);
 
-	COMMON_AUDIT_DATA_INIT(&ad, KMOD);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_KMOD;
 	ad.u.kmod_name = kmod_name;
 
 	return avc_has_perm(sid, SECINITSID_KERNEL, SECCLASS_SYSTEM,
@@ -3751,15 +3761,13 @@ static int sock_has_perm(struct task_struct *task, struct sock *sk, u32 perms)
 {
 	struct sk_security_struct *sksec = sk->sk_security;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	u32 tsid = task_sid(task);
 
 	if (sksec->sid == SECINITSID_KERNEL)
 		return 0;
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->sk = sk;
 
@@ -3839,7 +3847,6 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		char *addrp;
 		struct sk_security_struct *sksec = sk->sk_security;
 		struct common_audit_data ad;
-		struct selinux_audit_data sad = {0,};
 		struct lsm_network_audit net = {0,};
 		struct sockaddr_in *addr4 = NULL;
 		struct sockaddr_in6 *addr6 = NULL;
@@ -3866,8 +3873,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 						      snum, &sid);
 				if (err)
 					goto out;
-				COMMON_AUDIT_DATA_INIT(&ad, NET);
-				ad.selinux_audit_data = &sad;
+				ad.type = LSM_AUDIT_DATA_NET;
 				ad.u.net = &net;
 				ad.u.net->sport = htons(snum);
 				ad.u.net->family = family;
@@ -3901,8 +3907,7 @@ static int selinux_socket_bind(struct socket *sock, struct sockaddr *address, in
 		if (err)
 			goto out;
 
-		COMMON_AUDIT_DATA_INIT(&ad, NET);
-		ad.selinux_audit_data = &sad;
+		ad.type = LSM_AUDIT_DATA_NET;
 		ad.u.net = &net;
 		ad.u.net->sport = htons(snum);
 		ad.u.net->family = family;
@@ -3937,7 +3942,6 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
 	if (sksec->sclass == SECCLASS_TCP_SOCKET ||
 	    sksec->sclass == SECCLASS_DCCP_SOCKET) {
 		struct common_audit_data ad;
-		struct selinux_audit_data sad = {0,};
 		struct lsm_network_audit net = {0,};
 		struct sockaddr_in *addr4 = NULL;
 		struct sockaddr_in6 *addr6 = NULL;
@@ -3963,8 +3967,7 @@ static int selinux_socket_connect(struct socket *sock, struct sockaddr *address,
 		perm = (sksec->sclass == SECCLASS_TCP_SOCKET) ?
 		       TCP_SOCKET__NAME_CONNECT : DCCP_SOCKET__NAME_CONNECT;
 
-		COMMON_AUDIT_DATA_INIT(&ad, NET);
-		ad.selinux_audit_data = &sad;
+		ad.type = LSM_AUDIT_DATA_NET;
 		ad.u.net = &net;
 		ad.u.net->dport = htons(snum);
 		ad.u.net->family = sk->sk_family;
@@ -4056,12 +4059,10 @@ static int selinux_socket_unix_stream_connect(struct sock *sock,
 	struct sk_security_struct *sksec_other = other->sk_security;
 	struct sk_security_struct *sksec_new = newsk->sk_security;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	int err;
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->sk = other;
 
@@ -4090,11 +4091,9 @@ static int selinux_socket_unix_may_send(struct socket *sock,
 	struct sk_security_struct *ssec = sock->sk->sk_security;
 	struct sk_security_struct *osec = other->sk->sk_security;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->sk = other->sk;
 
@@ -4132,12 +4131,10 @@ static int selinux_sock_rcv_skb_compat(struct sock *sk, struct sk_buff *skb,
 	struct sk_security_struct *sksec = sk->sk_security;
 	u32 sk_sid = sksec->sid;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	char *addrp;
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->netif = skb->skb_iif;
 	ad.u.net->family = family;
@@ -4167,7 +4164,6 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	u16 family = sk->sk_family;
 	u32 sk_sid = sksec->sid;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	char *addrp;
 	u8 secmark_active;
@@ -4192,8 +4188,7 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	if (!secmark_active && !peerlbl_active)
 		return 0;
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->netif = skb->skb_iif;
 	ad.u.net->family = family;
@@ -4531,7 +4526,6 @@ static unsigned int selinux_ip_forward(struct sk_buff *skb, int ifindex,
 	char *addrp;
 	u32 peer_sid;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	u8 secmark_active;
 	u8 netlbl_active;
@@ -4549,8 +4543,7 @@ static unsigned int selinux_ip_forward(struct sk_buff *skb, int ifindex,
 	if (selinux_skb_peerlbl_sid(skb, family, &peer_sid) != 0)
 		return NF_DROP;
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->netif = ifindex;
 	ad.u.net->family = family;
@@ -4640,7 +4633,6 @@ static unsigned int selinux_ip_postroute_compat(struct sk_buff *skb,
 	struct sock *sk = skb->sk;
 	struct sk_security_struct *sksec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	char *addrp;
 	u8 proto;
@@ -4649,8 +4641,7 @@ static unsigned int selinux_ip_postroute_compat(struct sk_buff *skb,
 		return NF_ACCEPT;
 	sksec = sk->sk_security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->netif = ifindex;
 	ad.u.net->family = family;
@@ -4675,7 +4666,6 @@ static unsigned int selinux_ip_postroute(struct sk_buff *skb, int ifindex,
 	u32 peer_sid;
 	struct sock *sk;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	struct lsm_network_audit net = {0,};
 	char *addrp;
 	u8 secmark_active;
@@ -4722,8 +4712,7 @@ static unsigned int selinux_ip_postroute(struct sk_buff *skb, int ifindex,
 		secmark_perm = PACKET__SEND;
 	}
 
-	COMMON_AUDIT_DATA_INIT(&ad, NET);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_NET;
 	ad.u.net = &net;
 	ad.u.net->netif = ifindex;
 	ad.u.net->family = family;
@@ -4841,13 +4830,11 @@ static int ipc_has_perm(struct kern_ipc_perm *ipc_perms,
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 
 	isec = ipc_perms->security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = ipc_perms->key;
 
 	return avc_has_perm(sid, isec->sid, isec->sclass, perms, &ad);
@@ -4868,7 +4855,6 @@ static int selinux_msg_queue_alloc_security(struct msg_queue *msq)
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 	int rc;
 
@@ -4878,8 +4864,7 @@ static int selinux_msg_queue_alloc_security(struct msg_queue *msq)
 
 	isec = msq->q_perm.security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
 	rc = avc_has_perm(sid, isec->sid, SECCLASS_MSGQ,
@@ -4900,13 +4885,11 @@ static int selinux_msg_queue_associate(struct msg_queue *msq, int msqflg)
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 
 	isec = msq->q_perm.security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
 	return avc_has_perm(sid, isec->sid, SECCLASS_MSGQ,
@@ -4946,7 +4929,6 @@ static int selinux_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg,
 	struct ipc_security_struct *isec;
 	struct msg_security_struct *msec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 	int rc;
 
@@ -4967,8 +4949,7 @@ static int selinux_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg,
 			return rc;
 	}
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
 	/* Can this process write to the queue? */
@@ -4993,15 +4974,13 @@ static int selinux_msg_queue_msgrcv(struct msg_queue *msq, struct msg_msg *msg,
 	struct ipc_security_struct *isec;
 	struct msg_security_struct *msec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = task_sid(target);
 	int rc;
 
 	isec = msq->q_perm.security;
 	msec = msg->security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = msq->q_perm.key;
 
 	rc = avc_has_perm(sid, isec->sid,
@@ -5017,7 +4996,6 @@ static int selinux_shm_alloc_security(struct shmid_kernel *shp)
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 	int rc;
 
@@ -5027,8 +5005,7 @@ static int selinux_shm_alloc_security(struct shmid_kernel *shp)
 
 	isec = shp->shm_perm.security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = shp->shm_perm.key;
 
 	rc = avc_has_perm(sid, isec->sid, SECCLASS_SHM,
@@ -5049,13 +5026,11 @@ static int selinux_shm_associate(struct shmid_kernel *shp, int shmflg)
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 
 	isec = shp->shm_perm.security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = shp->shm_perm.key;
 
 	return avc_has_perm(sid, isec->sid, SECCLASS_SHM,
@@ -5113,7 +5088,6 @@ static int selinux_sem_alloc_security(struct sem_array *sma)
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 	int rc;
 
@@ -5123,8 +5097,7 @@ static int selinux_sem_alloc_security(struct sem_array *sma)
 
 	isec = sma->sem_perm.security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = sma->sem_perm.key;
 
 	rc = avc_has_perm(sid, isec->sid, SECCLASS_SEM,
@@ -5145,13 +5118,11 @@ static int selinux_sem_associate(struct sem_array *sma, int semflg)
 {
 	struct ipc_security_struct *isec;
 	struct common_audit_data ad;
-	struct selinux_audit_data sad = {0,};
 	u32 sid = current_sid();
 
 	isec = sma->sem_perm.security;
 
-	COMMON_AUDIT_DATA_INIT(&ad, IPC);
-	ad.selinux_audit_data = &sad;
+	ad.type = LSM_AUDIT_DATA_IPC;
 	ad.u.ipc_id = sma->sem_perm.key;
 
 	return avc_has_perm(sid, isec->sid, SECCLASS_SEM,
@@ -5331,8 +5302,23 @@ static int selinux_setprocattr(struct task_struct *p,
 		}
 		error = security_context_to_sid(value, size, &sid);
 		if (error == -EINVAL && !strcmp(name, "fscreate")) {
-			if (!capable(CAP_MAC_ADMIN))
+			if (!capable(CAP_MAC_ADMIN)) {
+				struct audit_buffer *ab;
+				size_t audit_size;
+
+				/* We strip a nul only if it is at the end, otherwise the
+				 * context contains a nul and we should audit that */
+				if (str[size - 1] == '\0')
+					audit_size = size - 1;
+				else
+					audit_size = size;
+				ab = audit_log_start(current->audit_context, GFP_ATOMIC, AUDIT_SELINUX_ERR);
+				audit_log_format(ab, "op=fscreate invalid_context=");
+				audit_log_n_untrustedstring(ab, value, audit_size);
+				audit_log_end(ab);
+
 				return error;
+			}
 			error = security_context_to_sid_force(value, size,
 							      &sid);
 		}
@@ -5592,7 +5578,7 @@ static struct security_operations selinux_ops = {
 	.file_send_sigiotask =		selinux_file_send_sigiotask,
 	.file_receive =			selinux_file_receive,
 
-	.dentry_open =			selinux_dentry_open,
+	.file_open =			selinux_file_open,
 
 	.task_create =			selinux_task_create,
 	.cred_alloc_blank =		selinux_cred_alloc_blank,
diff --git a/security/selinux/include/avc.h b/security/selinux/include/avc.h
index 1931370233d7..92d0ab561db8 100644
--- a/security/selinux/include/avc.h
+++ b/security/selinux/include/avc.h
@@ -49,7 +49,7 @@ struct avc_cache_stats {
 /*
  * We only need this data after we have decided to send an audit message.
  */
-struct selinux_late_audit_data {
+struct selinux_audit_data {
 	u32 ssid;
 	u32 tsid;
 	u16 tclass;
@@ -60,28 +60,86 @@ struct selinux_late_audit_data {
 };
 
 /*
- * We collect this at the beginning or during an selinux security operation
- */
-struct selinux_audit_data {
-	/*
-	 * auditdeny is a bit tricky and unintuitive.  See the
-	 * comments in avc.c for it's meaning and usage.
-	 */
-	u32 auditdeny;
-	struct selinux_late_audit_data *slad;
-};
-
-/*
  * AVC operations
  */
 
 void __init avc_init(void);
 
-int avc_audit(u32 ssid, u32 tsid,
-	       u16 tclass, u32 requested,
-	       struct av_decision *avd,
-	       int result,
-	      struct common_audit_data *a, unsigned flags);
+static inline u32 avc_audit_required(u32 requested,
+			      struct av_decision *avd,
+			      int result,
+			      u32 auditdeny,
+			      u32 *deniedp)
+{
+	u32 denied, audited;
+	denied = requested & ~avd->allowed;
+	if (unlikely(denied)) {
+		audited = denied & avd->auditdeny;
+		/*
+		 * auditdeny is TRICKY!  Setting a bit in
+		 * this field means that ANY denials should NOT be audited if
+		 * the policy contains an explicit dontaudit rule for that
+		 * permission.  Take notice that this is unrelated to the
+		 * actual permissions that were denied.  As an example lets
+		 * assume:
+		 *
+		 * denied == READ
+		 * avd.auditdeny & ACCESS == 0 (not set means explicit rule)
+		 * auditdeny & ACCESS == 1
+		 *
+		 * We will NOT audit the denial even though the denied
+		 * permission was READ and the auditdeny checks were for
+		 * ACCESS
+		 */
+		if (auditdeny && !(auditdeny & avd->auditdeny))
+			audited = 0;
+	} else if (result)
+		audited = denied = requested;
+	else
+		audited = requested & avd->auditallow;
+	*deniedp = denied;
+	return audited;
+}
+
+int slow_avc_audit(u32 ssid, u32 tsid, u16 tclass,
+		   u32 requested, u32 audited, u32 denied,
+		   struct common_audit_data *a,
+		   unsigned flags);
+
+/**
+ * avc_audit - Audit the granting or denial of permissions.
+ * @ssid: source security identifier
+ * @tsid: target security identifier
+ * @tclass: target security class
+ * @requested: requested permissions
+ * @avd: access vector decisions
+ * @result: result from avc_has_perm_noaudit
+ * @a:  auxiliary audit data
+ * @flags: VFS walk flags
+ *
+ * Audit the granting or denial of permissions in accordance
+ * with the policy.  This function is typically called by
+ * avc_has_perm() after a permission check, but can also be
+ * called directly by callers who use avc_has_perm_noaudit()
+ * in order to separate the permission check from the auditing.
+ * For example, this separation is useful when the permission check must
+ * be performed under a lock, to allow the lock to be released
+ * before calling the auditing code.
+ */
+static inline int avc_audit(u32 ssid, u32 tsid,
+			    u16 tclass, u32 requested,
+			    struct av_decision *avd,
+			    int result,
+			    struct common_audit_data *a, unsigned flags)
+{
+	u32 audited, denied;
+	audited = avc_audit_required(requested, avd, result, 0, &denied);
+	if (likely(!audited))
+		return 0;
+	return slow_avc_audit(ssid, tsid, tclass,
+			      requested, audited, denied,
+			      a, flags);
+}
 
 #define AVC_STRICT 1 /* Ignore permissive mode. */
 int avc_has_perm_noaudit(u32 ssid, u32 tsid,
@@ -112,11 +170,7 @@ u32 avc_policy_seqno(void);
 #define AVC_CALLBACK_AUDITDENY_ENABLE	64
 #define AVC_CALLBACK_AUDITDENY_DISABLE	128
 
-int avc_add_callback(int (*callback)(u32 event, u32 ssid, u32 tsid,
-				     u16 tclass, u32 perms,
-				     u32 *out_retained),
-		     u32 events, u32 ssid, u32 tsid,
-		     u16 tclass, u32 perms);
+int avc_add_callback(int (*callback)(u32 event), u32 events);
 
 /* Exported to selinuxfs */
 int avc_get_hash_stats(char *page);
diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h
index d871e8ad2103..dde2005407aa 100644
--- a/security/selinux/include/security.h
+++ b/security/selinux/include/security.h
@@ -31,13 +31,15 @@
 #define POLICYDB_VERSION_BOUNDARY	24
 #define POLICYDB_VERSION_FILENAME_TRANS	25
 #define POLICYDB_VERSION_ROLETRANS	26
+#define POLICYDB_VERSION_NEW_OBJECT_DEFAULTS	27
+#define POLICYDB_VERSION_DEFAULT_TYPE	28
 
 /* Range of policy versions we understand*/
 #define POLICYDB_VERSION_MIN   POLICYDB_VERSION_BASE
 #ifdef CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX
 #define POLICYDB_VERSION_MAX	CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX_VALUE
 #else
-#define POLICYDB_VERSION_MAX	POLICYDB_VERSION_ROLETRANS
+#define POLICYDB_VERSION_MAX	POLICYDB_VERSION_DEFAULT_TYPE
 #endif
 
 /* Mask for just the mount related flags */
diff --git a/security/selinux/netif.c b/security/selinux/netif.c
index 326f22cbe405..47a49d1a6f6a 100644
--- a/security/selinux/netif.c
+++ b/security/selinux/netif.c
@@ -252,8 +252,7 @@ static void sel_netif_flush(void)
 	spin_unlock_bh(&sel_netif_lock);
 }
 
-static int sel_netif_avc_callback(u32 event, u32 ssid, u32 tsid,
-				  u16 class, u32 perms, u32 *retained)
+static int sel_netif_avc_callback(u32 event)
 {
 	if (event == AVC_CALLBACK_RESET) {
 		sel_netif_flush();
@@ -292,8 +291,7 @@ static __init int sel_netif_init(void)
 
 	register_netdevice_notifier(&sel_netif_netdev_notifier);
 
-	err = avc_add_callback(sel_netif_avc_callback, AVC_CALLBACK_RESET,
-			       SECSID_NULL, SECSID_NULL, SECCLASS_NULL, 0);
+	err = avc_add_callback(sel_netif_avc_callback, AVC_CALLBACK_RESET);
 	if (err)
 		panic("avc_add_callback() failed, error %d\n", err);
 
diff --git a/security/selinux/netnode.c b/security/selinux/netnode.c
index 86365857c088..28f911cdd7c7 100644
--- a/security/selinux/netnode.c
+++ b/security/selinux/netnode.c
@@ -297,8 +297,7 @@ static void sel_netnode_flush(void)
 	spin_unlock_bh(&sel_netnode_lock);
 }
 
-static int sel_netnode_avc_callback(u32 event, u32 ssid, u32 tsid,
-				    u16 class, u32 perms, u32 *retained)
+static int sel_netnode_avc_callback(u32 event)
 {
 	if (event == AVC_CALLBACK_RESET) {
 		sel_netnode_flush();
@@ -320,8 +319,7 @@ static __init int sel_netnode_init(void)
 		sel_netnode_hash[iter].size = 0;
 	}
 
-	ret = avc_add_callback(sel_netnode_avc_callback, AVC_CALLBACK_RESET,
-			       SECSID_NULL, SECSID_NULL, SECCLASS_NULL, 0);
+	ret = avc_add_callback(sel_netnode_avc_callback, AVC_CALLBACK_RESET);
 	if (ret != 0)
 		panic("avc_add_callback() failed, error %d\n", ret);
 
diff --git a/security/selinux/netport.c b/security/selinux/netport.c
index 7b9eb1faf68b..d35379781c2c 100644
--- a/security/selinux/netport.c
+++ b/security/selinux/netport.c
@@ -234,8 +234,7 @@ static void sel_netport_flush(void)
 	spin_unlock_bh(&sel_netport_lock);
 }
 
-static int sel_netport_avc_callback(u32 event, u32 ssid, u32 tsid,
-				    u16 class, u32 perms, u32 *retained)
+static int sel_netport_avc_callback(u32 event)
 {
 	if (event == AVC_CALLBACK_RESET) {
 		sel_netport_flush();
@@ -257,8 +256,7 @@ static __init int sel_netport_init(void)
 		sel_netport_hash[iter].size = 0;
 	}
 
-	ret = avc_add_callback(sel_netport_avc_callback, AVC_CALLBACK_RESET,
-			       SECSID_NULL, SECSID_NULL, SECCLASS_NULL, 0);
+	ret = avc_add_callback(sel_netport_avc_callback, AVC_CALLBACK_RESET);
 	if (ret != 0)
 		panic("avc_add_callback() failed, error %d\n", ret);
 
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index d7018bfa1f00..4e93f9ef970b 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -496,6 +496,7 @@ static const struct file_operations sel_policy_ops = {
 	.read		= sel_read_policy,
 	.mmap		= sel_mmap_policy,
 	.release	= sel_release_policy,
+	.llseek		= generic_file_llseek,
 };
 
 static ssize_t sel_write_load(struct file *file, const char __user *buf,
@@ -1232,6 +1233,7 @@ static int sel_make_bools(void)
 		kfree(bool_pending_names[i]);
 	kfree(bool_pending_names);
 	kfree(bool_pending_values);
+	bool_num = 0;
 	bool_pending_names = NULL;
 	bool_pending_values = NULL;
 
@@ -1532,11 +1534,6 @@ static int sel_make_initcon_files(struct dentry *dir)
 	return 0;
 }
 
-static inline unsigned int sel_div(unsigned long a, unsigned long b)
-{
-	return a / b - (a % b < 0);
-}
-
 static inline unsigned long sel_class_to_ino(u16 class)
 {
 	return (class * (SEL_VEC_MAX + 1)) | SEL_CLASS_INO_OFFSET;
@@ -1544,7 +1541,7 @@ static inline unsigned long sel_class_to_ino(u16 class)
 
 static inline u16 sel_ino_to_class(unsigned long ino)
 {
-	return sel_div(ino & SEL_INO_MASK, SEL_VEC_MAX + 1);
+	return (ino & SEL_INO_MASK) / (SEL_VEC_MAX + 1);
 }
 
 static inline unsigned long sel_perm_to_ino(u16 class, u32 perm)
@@ -1831,7 +1828,7 @@ static int sel_fill_super(struct super_block *sb, void *data, int silent)
 		[SEL_REJECT_UNKNOWN] = {"reject_unknown", &sel_handle_unknown_ops, S_IRUGO},
 		[SEL_DENY_UNKNOWN] = {"deny_unknown", &sel_handle_unknown_ops, S_IRUGO},
 		[SEL_STATUS] = {"status", &sel_handle_status_ops, S_IRUGO},
-		[SEL_POLICY] = {"policy", &sel_policy_ops, S_IRUSR},
+		[SEL_POLICY] = {"policy", &sel_policy_ops, S_IRUGO},
 		/* last one */ {""}
 	};
 	ret = simple_fill_super(sb, SELINUX_MAGIC, selinux_files);
diff --git a/security/selinux/ss/context.h b/security/selinux/ss/context.h
index 45e8fb0515f8..212e3479a0d9 100644
--- a/security/selinux/ss/context.h
+++ b/security/selinux/ss/context.h
@@ -74,6 +74,26 @@ out:
 	return rc;
 }
 
+/*
+ * Sets both levels in the MLS range of 'dst' to the high level of 'src'.
+ */
+static inline int mls_context_cpy_high(struct context *dst, struct context *src)
+{
+	int rc;
+
+	dst->range.level[0].sens = src->range.level[1].sens;
+	rc = ebitmap_cpy(&dst->range.level[0].cat, &src->range.level[1].cat);
+	if (rc)
+		goto out;
+
+	dst->range.level[1].sens = src->range.level[1].sens;
+	rc = ebitmap_cpy(&dst->range.level[1].cat, &src->range.level[1].cat);
+	if (rc)
+		ebitmap_destroy(&dst->range.level[0].cat);
+out:
+	return rc;
+}
+
 static inline int mls_context_cmp(struct context *c1, struct context *c2)
 {
 	return ((c1->range.level[0].sens == c2->range.level[0].sens) &&
diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index fbf9c5816c71..40de8d3f208e 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -517,6 +517,8 @@ int mls_compute_sid(struct context *scontext,
 {
 	struct range_trans rtr;
 	struct mls_range *r;
+	struct class_datum *cladatum;
+	int default_range = 0;
 
 	if (!policydb.mls_enabled)
 		return 0;
@@ -530,6 +532,28 @@ int mls_compute_sid(struct context *scontext,
 		r = hashtab_search(policydb.range_tr, &rtr);
 		if (r)
 			return mls_range_set(newcontext, r);
+
+		if (tclass && tclass <= policydb.p_classes.nprim) {
+			cladatum = policydb.class_val_to_struct[tclass - 1];
+			if (cladatum)
+				default_range = cladatum->default_range;
+		}
+
+		switch (default_range) {
+		case DEFAULT_SOURCE_LOW:
+			return mls_context_cpy_low(newcontext, scontext);
+		case DEFAULT_SOURCE_HIGH:
+			return mls_context_cpy_high(newcontext, scontext);
+		case DEFAULT_SOURCE_LOW_HIGH:
+			return mls_context_cpy(newcontext, scontext);
+		case DEFAULT_TARGET_LOW:
+			return mls_context_cpy_low(newcontext, tcontext);
+		case DEFAULT_TARGET_HIGH:
+			return mls_context_cpy_high(newcontext, tcontext);
+		case DEFAULT_TARGET_LOW_HIGH:
+			return mls_context_cpy(newcontext, tcontext);
+		}
+
 		/* Fallthrough */
 	case AVTAB_CHANGE:
 		if ((tclass == policydb.process_class) || (sock == true))
diff --git a/security/selinux/ss/policydb.c b/security/selinux/ss/policydb.c
index a7f61d52f05c..9cd9b7c661ec 100644
--- a/security/selinux/ss/policydb.c
+++ b/security/selinux/ss/policydb.c
@@ -133,6 +133,16 @@ static struct policydb_compat_info policydb_compat[] = {
 		.sym_num	= SYM_NUM,
 		.ocon_num	= OCON_NUM,
 	},
+	{
+		.version	= POLICYDB_VERSION_NEW_OBJECT_DEFAULTS,
+		.sym_num	= SYM_NUM,
+		.ocon_num	= OCON_NUM,
+	},
+	{
+		.version	= POLICYDB_VERSION_DEFAULT_TYPE,
+		.sym_num	= SYM_NUM,
+		.ocon_num	= OCON_NUM,
+	},
 };
 
 static struct policydb_compat_info *policydb_lookup_compat(int version)
@@ -1306,6 +1316,23 @@ static int class_read(struct policydb *p, struct hashtab *h, void *fp)
 			goto bad;
 	}
 
+	if (p->policyvers >= POLICYDB_VERSION_NEW_OBJECT_DEFAULTS) {
+		rc = next_entry(buf, fp, sizeof(u32) * 3);
+		if (rc)
+			goto bad;
+
+		cladatum->default_user = le32_to_cpu(buf[0]);
+		cladatum->default_role = le32_to_cpu(buf[1]);
+		cladatum->default_range = le32_to_cpu(buf[2]);
+	}
+
+	if (p->policyvers >= POLICYDB_VERSION_DEFAULT_TYPE) {
+		rc = next_entry(buf, fp, sizeof(u32) * 1);
+		if (rc)
+			goto bad;
+		cladatum->default_type = le32_to_cpu(buf[0]);
+	}
+
 	rc = hashtab_insert(h, key, cladatum);
 	if (rc)
 		goto bad;
@@ -2832,6 +2859,23 @@ static int class_write(void *vkey, void *datum, void *ptr)
 	if (rc)
 		return rc;
 
+	if (p->policyvers >= POLICYDB_VERSION_NEW_OBJECT_DEFAULTS) {
+		buf[0] = cpu_to_le32(cladatum->default_user);
+		buf[1] = cpu_to_le32(cladatum->default_role);
+		buf[2] = cpu_to_le32(cladatum->default_range);
+
+		rc = put_entry(buf, sizeof(uint32_t), 3, fp);
+		if (rc)
+			return rc;
+	}
+
+	if (p->policyvers >= POLICYDB_VERSION_DEFAULT_TYPE) {
+		buf[0] = cpu_to_le32(cladatum->default_type);
+		rc = put_entry(buf, sizeof(uint32_t), 1, fp);
+		if (rc)
+			return rc;
+	}
+
 	return 0;
 }
 
diff --git a/security/selinux/ss/policydb.h b/security/selinux/ss/policydb.h
index b846c0387180..da637471d4ce 100644
--- a/security/selinux/ss/policydb.h
+++ b/security/selinux/ss/policydb.h
@@ -60,6 +60,20 @@ struct class_datum {
 	struct symtab permissions;	/* class-specific permission symbol table */
 	struct constraint_node *constraints;	/* constraints on class permissions */
 	struct constraint_node *validatetrans;	/* special transition rules */
+/* Options how a new object user, role, and type should be decided */
+#define DEFAULT_SOURCE         1
+#define DEFAULT_TARGET         2
+	char default_user;
+	char default_role;
+	char default_type;
+/* Options how a new object range should be decided */
+#define DEFAULT_SOURCE_LOW     1
+#define DEFAULT_SOURCE_HIGH    2
+#define DEFAULT_SOURCE_LOW_HIGH        3
+#define DEFAULT_TARGET_LOW     4
+#define DEFAULT_TARGET_HIGH    5
+#define DEFAULT_TARGET_LOW_HIGH        6
+	char default_range;
 };
 
 /* Role attributes */
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 185f849a26f6..4321b8fc8863 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -1018,9 +1018,11 @@ static int context_struct_to_string(struct context *context, char **scontext, u3
 
 	if (context->len) {
 		*scontext_len = context->len;
-		*scontext = kstrdup(context->str, GFP_ATOMIC);
-		if (!(*scontext))
-			return -ENOMEM;
+		if (scontext) {
+			*scontext = kstrdup(context->str, GFP_ATOMIC);
+			if (!(*scontext))
+				return -ENOMEM;
+		}
 		return 0;
 	}
 
@@ -1389,6 +1391,7 @@ static int security_compute_sid(u32 ssid,
 				u32 *out_sid,
 				bool kern)
 {
+	struct class_datum *cladatum = NULL;
 	struct context *scontext = NULL, *tcontext = NULL, newcontext;
 	struct role_trans *roletr = NULL;
 	struct avtab_key avkey;
@@ -1437,12 +1440,20 @@ static int security_compute_sid(u32 ssid,
 		goto out_unlock;
 	}
 
+	if (tclass && tclass <= policydb.p_classes.nprim)
+		cladatum = policydb.class_val_to_struct[tclass - 1];
+
 	/* Set the user identity. */
 	switch (specified) {
 	case AVTAB_TRANSITION:
 	case AVTAB_CHANGE:
-		/* Use the process user identity. */
-		newcontext.user = scontext->user;
+		if (cladatum && cladatum->default_user == DEFAULT_TARGET) {
+			newcontext.user = tcontext->user;
+		} else {
+			/* notice this gets both DEFAULT_SOURCE and unset */
+			/* Use the process user identity. */
+			newcontext.user = scontext->user;
+		}
 		break;
 	case AVTAB_MEMBER:
 		/* Use the related object owner. */
@@ -1450,16 +1461,31 @@ static int security_compute_sid(u32 ssid,
 		break;
 	}
 
-	/* Set the role and type to default values. */
-	if ((tclass == policydb.process_class) || (sock == true)) {
-		/* Use the current role and type of process. */
+	/* Set the role to default values. */
+	if (cladatum && cladatum->default_role == DEFAULT_SOURCE) {
 		newcontext.role = scontext->role;
-		newcontext.type = scontext->type;
+	} else if (cladatum && cladatum->default_role == DEFAULT_TARGET) {
+		newcontext.role = tcontext->role;
 	} else {
-		/* Use the well-defined object role. */
-		newcontext.role = OBJECT_R_VAL;
-		/* Use the type of the related object. */
+		if ((tclass == policydb.process_class) || (sock == true))
+			newcontext.role = scontext->role;
+		else
+			newcontext.role = OBJECT_R_VAL;
+	}
+
+	/* Set the type to default values. */
+	if (cladatum && cladatum->default_type == DEFAULT_SOURCE) {
+		newcontext.type = scontext->type;
+	} else if (cladatum && cladatum->default_type == DEFAULT_TARGET) {
 		newcontext.type = tcontext->type;
+	} else {
+		if ((tclass == policydb.process_class) || (sock == true)) {
+			/* Use the type of process. */
+			newcontext.type = scontext->type;
+		} else {
+			/* Use the type of the related object. */
+			newcontext.type = tcontext->type;
+		}
 	}
 
 	/* Look for a type transition/member/change rule. */
@@ -3018,8 +3044,7 @@ out:
 
 static int (*aurule_callback)(void) = audit_update_lsm_rules;
 
-static int aurule_avc_callback(u32 event, u32 ssid, u32 tsid,
-			       u16 class, u32 perms, u32 *retained)
+static int aurule_avc_callback(u32 event)
 {
 	int err = 0;
 
@@ -3032,8 +3057,7 @@ static int __init aurule_init(void)
 {
 	int err;
 
-	err = avc_add_callback(aurule_avc_callback, AVC_CALLBACK_RESET,
-			       SECSID_NULL, SECSID_NULL, SECCLASS_NULL, 0);
+	err = avc_add_callback(aurule_avc_callback, AVC_CALLBACK_RESET);
 	if (err)
 		panic("avc_add_callback() failed, error %d\n", err);
 
diff --git a/security/smack/smack.h b/security/smack/smack.h
index 4ede719922ed..cc361b8f3d13 100644
--- a/security/smack/smack.h
+++ b/security/smack/smack.h
@@ -23,13 +23,19 @@
 #include <linux/lsm_audit.h>
 
 /*
+ * Smack labels were limited to 23 characters for a long time.
+ */
+#define SMK_LABELLEN	24
+#define SMK_LONGLABEL	256
+
+/*
+ * Maximum number of bytes for the levels in a CIPSO IP option.
  * Why 23? CIPSO is constrained to 30, so a 32 byte buffer is
  * bigger than can be used, and 24 is the next lower multiple
  * of 8, and there are too many issues if there isn't space set
  * aside for the terminating null byte.
  */
-#define SMK_MAXLEN	23
-#define SMK_LABELLEN	(SMK_MAXLEN+1)
+#define SMK_CIPSOLEN	24
 
 struct superblock_smack {
 	char		*smk_root;
@@ -66,6 +72,7 @@ struct task_smack {
 
 #define	SMK_INODE_INSTANT	0x01	/* inode is instantiated */
 #define	SMK_INODE_TRANSMUTE	0x02	/* directory is transmuting */
+#define	SMK_INODE_CHANGED	0x04	/* smack was transmuted */
 
 /*
  * A label access rule.
@@ -78,15 +85,6 @@ struct smack_rule {
 };
 
 /*
- * An entry in the table mapping smack values to
- * CIPSO level/category-set values.
- */
-struct smack_cipso {
-	int	smk_level;
-	char	smk_catset[SMK_LABELLEN];
-};
-
-/*
  * An entry in the table identifying hosts.
  */
 struct smk_netlbladdr {
@@ -113,22 +111,19 @@ struct smk_netlbladdr {
  * interfaces don't. The secid should go away when all of
  * these components have been repaired.
  *
- * If there is a cipso value associated with the label it
- * gets stored here, too. This will most likely be rare as
- * the cipso direct mapping in used internally.
+ * The cipso value associated with the label gets stored here, too.
  *
  * Keep the access rules for this subject label here so that
  * the entire set of rules does not need to be examined every
  * time.
  */
 struct smack_known {
-	struct list_head	list;
-	char			smk_known[SMK_LABELLEN];
-	u32			smk_secid;
-	struct smack_cipso	*smk_cipso;
-	spinlock_t		smk_cipsolock;	/* for changing cipso map */
-	struct list_head	smk_rules;	/* access rules */
-	struct mutex		smk_rules_lock;	/* lock for the rules */
+	struct list_head		list;
+	char				*smk_known;
+	u32				smk_secid;
+	struct netlbl_lsm_secattr	smk_netlabel;	/* on wire labels */
+	struct list_head		smk_rules;	/* access rules */
+	struct mutex			smk_rules_lock;	/* lock for rules */
 };
 
 /*
@@ -165,6 +160,7 @@ struct smack_known {
 #define SMACK_CIPSO_DOI_DEFAULT		3	/* Historical */
 #define SMACK_CIPSO_DOI_INVALID		-1	/* Not a DOI */
 #define SMACK_CIPSO_DIRECT_DEFAULT	250	/* Arbitrary */
+#define SMACK_CIPSO_MAPPED_DEFAULT	251	/* Also arbitrary */
 #define SMACK_CIPSO_MAXCATVAL		63	/* Bigger gets harder */
 #define SMACK_CIPSO_MAXLEVEL            255     /* CIPSO 2.2 standard */
 #define SMACK_CIPSO_MAXCATNUM           239     /* CIPSO 2.2 standard */
@@ -215,10 +211,9 @@ struct inode_smack *new_inode_smack(char *);
 int smk_access_entry(char *, char *, struct list_head *);
 int smk_access(char *, char *, int, struct smk_audit_info *);
 int smk_curacc(char *, u32, struct smk_audit_info *);
-int smack_to_cipso(const char *, struct smack_cipso *);
-char *smack_from_cipso(u32, char *);
 char *smack_from_secid(const u32);
-void smk_parse_smack(const char *string, int len, char *smack);
+char *smk_parse_smack(const char *string, int len);
+int smk_netlbl_mls(int, char *, struct netlbl_lsm_secattr *, int);
 char *smk_import(const char *, int);
 struct smack_known *smk_import_entry(const char *, int);
 struct smack_known *smk_find_entry(const char *);
@@ -228,6 +223,7 @@ u32 smack_to_secid(const char *);
  * Shared data.
  */
 extern int smack_cipso_direct;
+extern int smack_cipso_mapped;
 extern char *smack_net_ambient;
 extern char *smack_onlycap;
 extern const char *smack_cipso_option;
@@ -239,24 +235,13 @@ extern struct smack_known smack_known_invalid;
 extern struct smack_known smack_known_star;
 extern struct smack_known smack_known_web;
 
+extern struct mutex	smack_known_lock;
 extern struct list_head smack_known_list;
 extern struct list_head smk_netlbladdr_list;
 
 extern struct security_operations smack_ops;
 
 /*
- * Stricly for CIPSO level manipulation.
- * Set the category bit number in a smack label sized buffer.
- */
-static inline void smack_catset_bit(int cat, char *catsetp)
-{
-	if (cat > SMK_LABELLEN * 8)
-		return;
-
-	catsetp[(cat - 1) / 8] |= 0x80 >> ((cat - 1) % 8);
-}
-
-/*
  * Is the directory transmuting?
  */
 static inline int smk_inode_transmutable(const struct inode *isp)
@@ -319,7 +304,7 @@ void smack_log(char *subject_label, char *object_label,
 static inline void smk_ad_init(struct smk_audit_info *a, const char *func,
 			       char type)
 {
-	memset(a, 0, sizeof(*a));
+	memset(&a->sad, 0, sizeof(a->sad));
 	a->a.type = type;
 	a->a.smack_audit_data = &a->sad;
 	a->a.smack_audit_data->function = func;
diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index c8115f7308f8..9f3705e92712 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -19,37 +19,31 @@
 struct smack_known smack_known_huh = {
 	.smk_known	= "?",
 	.smk_secid	= 2,
-	.smk_cipso	= NULL,
 };
 
 struct smack_known smack_known_hat = {
 	.smk_known	= "^",
 	.smk_secid	= 3,
-	.smk_cipso	= NULL,
 };
 
 struct smack_known smack_known_star = {
 	.smk_known	= "*",
 	.smk_secid	= 4,
-	.smk_cipso	= NULL,
 };
 
 struct smack_known smack_known_floor = {
 	.smk_known	= "_",
 	.smk_secid	= 5,
-	.smk_cipso	= NULL,
 };
 
 struct smack_known smack_known_invalid = {
 	.smk_known	= "",
 	.smk_secid	= 6,
-	.smk_cipso	= NULL,
 };
 
 struct smack_known smack_known_web = {
 	.smk_known	= "@",
 	.smk_secid	= 7,
-	.smk_cipso	= NULL,
 };
 
 LIST_HEAD(smack_known_list);
@@ -331,7 +325,7 @@ void smack_log(char *subject_label, char *object_label, int request,
 }
 #endif
 
-static DEFINE_MUTEX(smack_known_lock);
+DEFINE_MUTEX(smack_known_lock);
 
 /**
  * smk_find_entry - find a label on the list, return the list entry
@@ -345,7 +339,7 @@ struct smack_known *smk_find_entry(const char *string)
 	struct smack_known *skp;
 
 	list_for_each_entry_rcu(skp, &smack_known_list, list) {
-		if (strncmp(skp->smk_known, string, SMK_MAXLEN) == 0)
+		if (strcmp(skp->smk_known, string) == 0)
 			return skp;
 	}
 
@@ -356,27 +350,76 @@ struct smack_known *smk_find_entry(const char *string)
  * smk_parse_smack - parse smack label from a text string
  * @string: a text string that might contain a Smack label
  * @len: the maximum size, or zero if it is NULL terminated.
- * @smack: parsed smack label, or NULL if parse error
+ *
+ * Returns a pointer to the clean label, or NULL
  */
-void smk_parse_smack(const char *string, int len, char *smack)
+char *smk_parse_smack(const char *string, int len)
 {
-	int found;
+	char *smack;
 	int i;
 
-	if (len <= 0 || len > SMK_MAXLEN)
-		len = SMK_MAXLEN;
-
-	for (i = 0, found = 0; i < SMK_LABELLEN; i++) {
-		if (found)
-			smack[i] = '\0';
-		else if (i >= len || string[i] > '~' || string[i] <= ' ' ||
-			 string[i] == '/' || string[i] == '"' ||
-			 string[i] == '\\' || string[i] == '\'') {
-			smack[i] = '\0';
-			found = 1;
-		} else
-			smack[i] = string[i];
+	if (len <= 0)
+		len = strlen(string) + 1;
+
+	/*
+	 * Reserve a leading '-' as an indicator that
+	 * this isn't a label, but an option to interfaces
+	 * including /smack/cipso and /smack/cipso2
+	 */
+	if (string[0] == '-')
+		return NULL;
+
+	for (i = 0; i < len; i++)
+		if (string[i] > '~' || string[i] <= ' ' || string[i] == '/' ||
+		    string[i] == '"' || string[i] == '\\' || string[i] == '\'')
+			break;
+
+	if (i == 0 || i >= SMK_LONGLABEL)
+		return NULL;
+
+	smack = kzalloc(i + 1, GFP_KERNEL);
+	if (smack != NULL) {
+		strncpy(smack, string, i + 1);
+		smack[i] = '\0';
 	}
+	return smack;
+}
+
+/**
+ * smk_netlbl_mls - convert a catset to netlabel mls categories
+ * @catset: the Smack categories
+ * @sap: where to put the netlabel categories
+ *
+ * Allocates and fills attr.mls
+ * Returns 0 on success, error code on failure.
+ */
+int smk_netlbl_mls(int level, char *catset, struct netlbl_lsm_secattr *sap,
+			int len)
+{
+	unsigned char *cp;
+	unsigned char m;
+	int cat;
+	int rc;
+	int byte;
+
+	sap->flags |= NETLBL_SECATTR_MLS_CAT;
+	sap->attr.mls.lvl = level;
+	sap->attr.mls.cat = netlbl_secattr_catmap_alloc(GFP_ATOMIC);
+	sap->attr.mls.cat->startbit = 0;
+
+	for (cat = 1, cp = catset, byte = 0; byte < len; cp++, byte++)
+		for (m = 0x80; m != 0; m >>= 1, cat++) {
+			if ((m & *cp) == 0)
+				continue;
+			rc = netlbl_secattr_catmap_setbit(sap->attr.mls.cat,
+							  cat, GFP_ATOMIC);
+			if (rc < 0) {
+				netlbl_secattr_catmap_free(sap->attr.mls.cat);
+				return rc;
+			}
+		}
+
+	return 0;
 }
 
 /**
@@ -390,33 +433,59 @@ void smk_parse_smack(const char *string, int len, char *smack)
 struct smack_known *smk_import_entry(const char *string, int len)
 {
 	struct smack_known *skp;
-	char smack[SMK_LABELLEN];
+	char *smack;
+	int slen;
+	int rc;
 
-	smk_parse_smack(string, len, smack);
-	if (smack[0] == '\0')
+	smack = smk_parse_smack(string, len);
+	if (smack == NULL)
 		return NULL;
 
 	mutex_lock(&smack_known_lock);
 
 	skp = smk_find_entry(smack);
+	if (skp != NULL)
+		goto freeout;
 
-	if (skp == NULL) {
-		skp = kzalloc(sizeof(struct smack_known), GFP_KERNEL);
-		if (skp != NULL) {
-			strncpy(skp->smk_known, smack, SMK_MAXLEN);
-			skp->smk_secid = smack_next_secid++;
-			skp->smk_cipso = NULL;
-			INIT_LIST_HEAD(&skp->smk_rules);
-			spin_lock_init(&skp->smk_cipsolock);
-			mutex_init(&skp->smk_rules_lock);
-			/*
-			 * Make sure that the entry is actually
-			 * filled before putting it on the list.
-			 */
-			list_add_rcu(&skp->list, &smack_known_list);
-		}
-	}
+	skp = kzalloc(sizeof(*skp), GFP_KERNEL);
+	if (skp == NULL)
+		goto freeout;
 
+	skp->smk_known = smack;
+	skp->smk_secid = smack_next_secid++;
+	skp->smk_netlabel.domain = skp->smk_known;
+	skp->smk_netlabel.flags =
+		NETLBL_SECATTR_DOMAIN | NETLBL_SECATTR_MLS_LVL;
+	/*
+	 * If direct labeling works use it.
+	 * Otherwise use mapped labeling.
+	 */
+	slen = strlen(smack);
+	if (slen < SMK_CIPSOLEN)
+		rc = smk_netlbl_mls(smack_cipso_direct, skp->smk_known,
+			       &skp->smk_netlabel, slen);
+	else
+		rc = smk_netlbl_mls(smack_cipso_mapped, (char *)&skp->smk_secid,
+			       &skp->smk_netlabel, sizeof(skp->smk_secid));
+
+	if (rc >= 0) {
+		INIT_LIST_HEAD(&skp->smk_rules);
+		mutex_init(&skp->smk_rules_lock);
+		/*
+		 * Make sure that the entry is actually
+		 * filled before putting it on the list.
+		 */
+		list_add_rcu(&skp->list, &smack_known_list);
+		goto unlockout;
+	}
+	/*
+	 * smk_netlbl_mls failed.
+	 */
+	kfree(skp);
+	skp = NULL;
+freeout:
+	kfree(smack);
+unlockout:
 	mutex_unlock(&smack_known_lock);
 
 	return skp;
@@ -479,79 +548,9 @@ char *smack_from_secid(const u32 secid)
  */
 u32 smack_to_secid(const char *smack)
 {
-	struct smack_known *skp;
+	struct smack_known *skp = smk_find_entry(smack);
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(skp, &smack_known_list, list) {
-		if (strncmp(skp->smk_known, smack, SMK_MAXLEN) == 0) {
-			rcu_read_unlock();
-			return skp->smk_secid;
-		}
-	}
-	rcu_read_unlock();
-	return 0;
-}
-
-/**
- * smack_from_cipso - find the Smack label associated with a CIPSO option
- * @level: Bell & LaPadula level from the network
- * @cp: Bell & LaPadula categories from the network
- *
- * This is a simple lookup in the label table.
- *
- * Return the matching label from the label list or NULL.
- */
-char *smack_from_cipso(u32 level, char *cp)
-{
-	struct smack_known *kp;
-	char *final = NULL;
-
-	rcu_read_lock();
-	list_for_each_entry(kp, &smack_known_list, list) {
-		if (kp->smk_cipso == NULL)
-			continue;
-
-		spin_lock_bh(&kp->smk_cipsolock);
-
-		if (kp->smk_cipso->smk_level == level &&
-		    memcmp(kp->smk_cipso->smk_catset, cp, SMK_LABELLEN) == 0)
-			final = kp->smk_known;
-
-		spin_unlock_bh(&kp->smk_cipsolock);
-
-		if (final != NULL)
-			break;
-	}
-	rcu_read_unlock();
-
-	return final;
-}
-
-/**
- * smack_to_cipso - find the CIPSO option to go with a Smack label
- * @smack: a pointer to the smack label in question
- * @cp: where to put the result
- *
- * Returns zero if a value is available, non-zero otherwise.
- */
-int smack_to_cipso(const char *smack, struct smack_cipso *cp)
-{
-	struct smack_known *kp;
-	int found = 0;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(kp, &smack_known_list, list) {
-		if (kp->smk_known == smack ||
-		    strcmp(kp->smk_known, smack) == 0) {
-			found = 1;
-			break;
-		}
-	}
-	rcu_read_unlock();
-
-	if (found == 0 || kp->smk_cipso == NULL)
-		return -ENOENT;
-
-	memcpy(cp, kp->smk_cipso, sizeof(struct smack_cipso));
-	return 0;
+	if (skp == NULL)
+		return 0;
+	return skp->smk_secid;
 }
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 45c32f074166..d583c0545808 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -30,7 +30,6 @@
 #include <linux/slab.h>
 #include <linux/mutex.h>
 #include <linux/pipe_fs_i.h>
-#include <net/netlabel.h>
 #include <net/cipso_ipv4.h>
 #include <linux/audit.h>
 #include <linux/magic.h>
@@ -57,16 +56,23 @@
 static char *smk_fetch(const char *name, struct inode *ip, struct dentry *dp)
 {
 	int rc;
-	char in[SMK_LABELLEN];
+	char *buffer;
+	char *result = NULL;
 
 	if (ip->i_op->getxattr == NULL)
 		return NULL;
 
-	rc = ip->i_op->getxattr(dp, name, in, SMK_LABELLEN);
-	if (rc < 0)
+	buffer = kzalloc(SMK_LONGLABEL, GFP_KERNEL);
+	if (buffer == NULL)
 		return NULL;
 
-	return smk_import(in, rc);
+	rc = ip->i_op->getxattr(dp, name, buffer, SMK_LONGLABEL);
+	if (rc > 0)
+		result = smk_import(buffer, rc);
+
+	kfree(buffer);
+
+	return result;
 }
 
 /**
@@ -79,7 +85,7 @@ struct inode_smack *new_inode_smack(char *smack)
 {
 	struct inode_smack *isp;
 
-	isp = kzalloc(sizeof(struct inode_smack), GFP_KERNEL);
+	isp = kzalloc(sizeof(struct inode_smack), GFP_NOFS);
 	if (isp == NULL)
 		return NULL;
 
@@ -556,13 +562,14 @@ static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 				     void **value, size_t *len)
 {
 	struct smack_known *skp;
+	struct inode_smack *issp = inode->i_security;
 	char *csp = smk_of_current();
 	char *isp = smk_of_inode(inode);
 	char *dsp = smk_of_inode(dir);
 	int may;
 
 	if (name) {
-		*name = kstrdup(XATTR_SMACK_SUFFIX, GFP_KERNEL);
+		*name = kstrdup(XATTR_SMACK_SUFFIX, GFP_NOFS);
 		if (*name == NULL)
 			return -ENOMEM;
 	}
@@ -577,12 +584,15 @@ static int smack_inode_init_security(struct inode *inode, struct inode *dir,
 		 * If the access rule allows transmutation and
 		 * the directory requests transmutation then
 		 * by all means transmute.
+		 * Mark the inode as changed.
 		 */
 		if (may > 0 && ((may & MAY_TRANSMUTE) != 0) &&
-		    smk_inode_transmutable(dir))
+		    smk_inode_transmutable(dir)) {
 			isp = dsp;
+			issp->smk_flags |= SMK_INODE_CHANGED;
+		}
 
-		*value = kstrdup(isp, GFP_KERNEL);
+		*value = kstrdup(isp, GFP_NOFS);
 		if (*value == NULL)
 			return -ENOMEM;
 	}
@@ -821,7 +831,7 @@ static int smack_inode_setxattr(struct dentry *dentry, const char *name,
 		 * check label validity here so import wont fail on
 		 * post_setxattr
 		 */
-		if (size == 0 || size >= SMK_LABELLEN ||
+		if (size == 0 || size >= SMK_LONGLABEL ||
 		    smk_import(value, size) == NULL)
 			rc = -EINVAL;
 	} else if (strcmp(name, XATTR_NAME_SMACKTRANSMUTE) == 0) {
@@ -1349,7 +1359,7 @@ static int smack_file_receive(struct file *file)
 }
 
 /**
- * smack_dentry_open - Smack dentry open processing
+ * smack_file_open - Smack dentry open processing
  * @file: the object
  * @cred: unused
  *
@@ -1357,7 +1367,7 @@ static int smack_file_receive(struct file *file)
  *
  * Returns 0
  */
-static int smack_dentry_open(struct file *file, const struct cred *cred)
+static int smack_file_open(struct file *file, const struct cred *cred)
 {
 	struct inode_smack *isp = file->f_path.dentry->d_inode->i_security;
 
@@ -1820,65 +1830,6 @@ static char *smack_host_label(struct sockaddr_in *sip)
 }
 
 /**
- * smack_set_catset - convert a capset to netlabel mls categories
- * @catset: the Smack categories
- * @sap: where to put the netlabel categories
- *
- * Allocates and fills attr.mls.cat
- */
-static void smack_set_catset(char *catset, struct netlbl_lsm_secattr *sap)
-{
-	unsigned char *cp;
-	unsigned char m;
-	int cat;
-	int rc;
-	int byte;
-
-	if (!catset)
-		return;
-
-	sap->flags |= NETLBL_SECATTR_MLS_CAT;
-	sap->attr.mls.cat = netlbl_secattr_catmap_alloc(GFP_ATOMIC);
-	sap->attr.mls.cat->startbit = 0;
-
-	for (cat = 1, cp = catset, byte = 0; byte < SMK_LABELLEN; cp++, byte++)
-		for (m = 0x80; m != 0; m >>= 1, cat++) {
-			if ((m & *cp) == 0)
-				continue;
-			rc = netlbl_secattr_catmap_setbit(sap->attr.mls.cat,
-							  cat, GFP_ATOMIC);
-		}
-}
-
-/**
- * smack_to_secattr - fill a secattr from a smack value
- * @smack: the smack value
- * @nlsp: where the result goes
- *
- * Casey says that CIPSO is good enough for now.
- * It can be used to effect.
- * It can also be abused to effect when necessary.
- * Apologies to the TSIG group in general and GW in particular.
- */
-static void smack_to_secattr(char *smack, struct netlbl_lsm_secattr *nlsp)
-{
-	struct smack_cipso cipso;
-	int rc;
-
-	nlsp->domain = smack;
-	nlsp->flags = NETLBL_SECATTR_DOMAIN | NETLBL_SECATTR_MLS_LVL;
-
-	rc = smack_to_cipso(smack, &cipso);
-	if (rc == 0) {
-		nlsp->attr.mls.lvl = cipso.smk_level;
-		smack_set_catset(cipso.smk_catset, nlsp);
-	} else {
-		nlsp->attr.mls.lvl = smack_cipso_direct;
-		smack_set_catset(smack, nlsp);
-	}
-}
-
-/**
  * smack_netlabel - Set the secattr on a socket
  * @sk: the socket
  * @labeled: socket label scheme
@@ -1890,8 +1841,8 @@ static void smack_to_secattr(char *smack, struct netlbl_lsm_secattr *nlsp)
  */
 static int smack_netlabel(struct sock *sk, int labeled)
 {
+	struct smack_known *skp;
 	struct socket_smack *ssp = sk->sk_security;
-	struct netlbl_lsm_secattr secattr;
 	int rc = 0;
 
 	/*
@@ -1909,10 +1860,8 @@ static int smack_netlabel(struct sock *sk, int labeled)
 	    labeled == SMACK_UNLABELED_SOCKET)
 		netlbl_sock_delattr(sk);
 	else {
-		netlbl_secattr_init(&secattr);
-		smack_to_secattr(ssp->smk_out, &secattr);
-		rc = netlbl_sock_setattr(sk, sk->sk_family, &secattr);
-		netlbl_secattr_destroy(&secattr);
+		skp = smk_find_entry(ssp->smk_out);
+		rc = netlbl_sock_setattr(sk, sk->sk_family, &skp->smk_netlabel);
 	}
 
 	bh_unlock_sock(sk);
@@ -1985,7 +1934,7 @@ static int smack_inode_setsecurity(struct inode *inode, const char *name,
 	struct socket *sock;
 	int rc = 0;
 
-	if (value == NULL || size > SMK_LABELLEN || size == 0)
+	if (value == NULL || size > SMK_LONGLABEL || size == 0)
 		return -EACCES;
 
 	sp = smk_import(value, size);
@@ -2552,6 +2501,7 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode)
 	char *final;
 	char trattr[TRANS_TRUE_SIZE];
 	int transflag = 0;
+	int rc;
 	struct dentry *dp;
 
 	if (inode == NULL)
@@ -2670,17 +2620,38 @@ static void smack_d_instantiate(struct dentry *opt_dentry, struct inode *inode)
 		 */
 		dp = dget(opt_dentry);
 		fetched = smk_fetch(XATTR_NAME_SMACK, inode, dp);
-		if (fetched != NULL) {
+		if (fetched != NULL)
 			final = fetched;
-			if (S_ISDIR(inode->i_mode)) {
-				trattr[0] = '\0';
-				inode->i_op->getxattr(dp,
+
+		/*
+		 * Transmuting directory
+		 */
+		if (S_ISDIR(inode->i_mode)) {
+			/*
+			 * If this is a new directory and the label was
+			 * transmuted when the inode was initialized
+			 * set the transmute attribute on the directory
+			 * and mark the inode.
+			 *
+			 * If there is a transmute attribute on the
+			 * directory mark the inode.
+			 */
+			if (isp->smk_flags & SMK_INODE_CHANGED) {
+				isp->smk_flags &= ~SMK_INODE_CHANGED;
+				rc = inode->i_op->setxattr(dp,
 					XATTR_NAME_SMACKTRANSMUTE,
-					trattr, TRANS_TRUE_SIZE);
-				if (strncmp(trattr, TRANS_TRUE,
-					    TRANS_TRUE_SIZE) == 0)
-					transflag = SMK_INODE_TRANSMUTE;
+					TRANS_TRUE, TRANS_TRUE_SIZE,
+					0);
+			} else {
+				rc = inode->i_op->getxattr(dp,
+					XATTR_NAME_SMACKTRANSMUTE, trattr,
+					TRANS_TRUE_SIZE);
+				if (rc >= 0 && strncmp(trattr, TRANS_TRUE,
+						       TRANS_TRUE_SIZE) != 0)
+					rc = -EINVAL;
 			}
+			if (rc >= 0)
+				transflag = SMK_INODE_TRANSMUTE;
 		}
 		isp->smk_task = smk_fetch(XATTR_NAME_SMACKEXEC, inode, dp);
 		isp->smk_mmap = smk_fetch(XATTR_NAME_SMACKMMAP, inode, dp);
@@ -2759,7 +2730,7 @@ static int smack_setprocattr(struct task_struct *p, char *name,
 	if (!capable(CAP_MAC_ADMIN))
 		return -EPERM;
 
-	if (value == NULL || size == 0 || size >= SMK_LABELLEN)
+	if (value == NULL || size == 0 || size >= SMK_LONGLABEL)
 		return -EINVAL;
 
 	if (strcmp(name, "current") != 0)
@@ -2895,10 +2866,9 @@ static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg,
 static char *smack_from_secattr(struct netlbl_lsm_secattr *sap,
 				struct socket_smack *ssp)
 {
-	struct smack_known *skp;
-	char smack[SMK_LABELLEN];
+	struct smack_known *kp;
 	char *sp;
-	int pcat;
+	int found = 0;
 
 	if ((sap->flags & NETLBL_SECATTR_MLS_LVL) != 0) {
 		/*
@@ -2906,59 +2876,27 @@ static char *smack_from_secattr(struct netlbl_lsm_secattr *sap,
 		 * If there are flags but no level netlabel isn't
 		 * behaving the way we expect it to.
 		 *
-		 * Get the categories, if any
+		 * Look it up in the label table
 		 * Without guidance regarding the smack value
 		 * for the packet fall back on the network
 		 * ambient value.
 		 */
-		memset(smack, '\0', SMK_LABELLEN);
-		if ((sap->flags & NETLBL_SECATTR_MLS_CAT) != 0)
-			for (pcat = -1;;) {
-				pcat = netlbl_secattr_catmap_walk(
-					sap->attr.mls.cat, pcat + 1);
-				if (pcat < 0)
-					break;
-				smack_catset_bit(pcat, smack);
-			}
-		/*
-		 * If it is CIPSO using smack direct mapping
-		 * we are already done. WeeHee.
-		 */
-		if (sap->attr.mls.lvl == smack_cipso_direct) {
-			/*
-			 * The label sent is usually on the label list.
-			 *
-			 * If it is not we may still want to allow the
-			 * delivery.
-			 *
-			 * If the recipient is accepting all packets
-			 * because it is using the star ("*") label
-			 * for SMACK64IPIN provide the web ("@") label
-			 * so that a directed response will succeed.
-			 * This is not very correct from a MAC point
-			 * of view, but gets around the problem that
-			 * locking prevents adding the newly discovered
-			 * label to the list.
-			 * The case where the recipient is not using
-			 * the star label should obviously fail.
-			 * The easy way to do this is to provide the
-			 * star label as the subject label.
-			 */
-			skp = smk_find_entry(smack);
-			if (skp != NULL)
-				return skp->smk_known;
-			if (ssp != NULL &&
-			    ssp->smk_in == smack_known_star.smk_known)
-				return smack_known_web.smk_known;
-			return smack_known_star.smk_known;
+		rcu_read_lock();
+		list_for_each_entry(kp, &smack_known_list, list) {
+			if (sap->attr.mls.lvl != kp->smk_netlabel.attr.mls.lvl)
+				continue;
+			if (memcmp(sap->attr.mls.cat,
+				kp->smk_netlabel.attr.mls.cat,
+				SMK_CIPSOLEN) != 0)
+				continue;
+			found = 1;
+			break;
 		}
-		/*
-		 * Look it up in the supplied table if it is not
-		 * a direct mapping.
-		 */
-		sp = smack_from_cipso(sap->attr.mls.lvl, smack);
-		if (sp != NULL)
-			return sp;
+		rcu_read_unlock();
+
+		if (found)
+			return kp->smk_known;
+
 		if (ssp != NULL && ssp->smk_in == smack_known_star.smk_known)
 			return smack_known_web.smk_known;
 		return smack_known_star.smk_known;
@@ -3158,11 +3096,13 @@ static int smack_inet_conn_request(struct sock *sk, struct sk_buff *skb,
 				   struct request_sock *req)
 {
 	u16 family = sk->sk_family;
+	struct smack_known *skp;
 	struct socket_smack *ssp = sk->sk_security;
 	struct netlbl_lsm_secattr secattr;
 	struct sockaddr_in addr;
 	struct iphdr *hdr;
 	char *sp;
+	char *hsp;
 	int rc;
 	struct smk_audit_info ad;
 #ifdef CONFIG_AUDIT
@@ -3209,16 +3149,14 @@ static int smack_inet_conn_request(struct sock *sk, struct sk_buff *skb,
 	hdr = ip_hdr(skb);
 	addr.sin_addr.s_addr = hdr->saddr;
 	rcu_read_lock();
-	if (smack_host_label(&addr) == NULL) {
-		rcu_read_unlock();
-		netlbl_secattr_init(&secattr);
-		smack_to_secattr(sp, &secattr);
-		rc = netlbl_req_setattr(req, &secattr);
-		netlbl_secattr_destroy(&secattr);
-	} else {
-		rcu_read_unlock();
+	hsp = smack_host_label(&addr);
+	rcu_read_unlock();
+
+	if (hsp == NULL) {
+		skp = smk_find_entry(sp);
+		rc = netlbl_req_setattr(req, &skp->smk_netlabel);
+	} else
 		netlbl_req_delattr(req);
-	}
 
 	return rc;
 }
@@ -3400,7 +3338,7 @@ static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule,
 	char *rule = vrule;
 
 	if (!rule) {
-		audit_log(actx, GFP_KERNEL, AUDIT_SELINUX_ERR,
+		audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR,
 			  "Smack: missing rule\n");
 		return -ENOENT;
 	}
@@ -3549,7 +3487,7 @@ struct security_operations smack_ops = {
 	.file_send_sigiotask = 		smack_file_send_sigiotask,
 	.file_receive = 		smack_file_receive,
 
-	.dentry_open =			smack_dentry_open,
+	.file_open =			smack_file_open,
 
 	.cred_alloc_blank =		smack_cred_alloc_blank,
 	.cred_free =			smack_cred_free,
@@ -3643,15 +3581,6 @@ struct security_operations smack_ops = {
 static __init void init_smack_known_list(void)
 {
 	/*
-	 * Initialize CIPSO locks
-	 */
-	spin_lock_init(&smack_known_huh.smk_cipsolock);
-	spin_lock_init(&smack_known_hat.smk_cipsolock);
-	spin_lock_init(&smack_known_star.smk_cipsolock);
-	spin_lock_init(&smack_known_floor.smk_cipsolock);
-	spin_lock_init(&smack_known_invalid.smk_cipsolock);
-	spin_lock_init(&smack_known_web.smk_cipsolock);
-	/*
 	 * Initialize rule list locks
 	 */
 	mutex_init(&smack_known_huh.smk_rules_lock);
diff --git a/security/smack/smackfs.c b/security/smack/smackfs.c
index 038811cb7e62..1810c9a4ed48 100644
--- a/security/smack/smackfs.c
+++ b/security/smack/smackfs.c
@@ -22,7 +22,6 @@
 #include <linux/mutex.h>
 #include <linux/slab.h>
 #include <net/net_namespace.h>
-#include <net/netlabel.h>
 #include <net/cipso_ipv4.h>
 #include <linux/seq_file.h>
 #include <linux/ctype.h>
@@ -45,6 +44,11 @@ enum smk_inos {
 	SMK_LOGGING	= 10,	/* logging */
 	SMK_LOAD_SELF	= 11,	/* task specific rules */
 	SMK_ACCESSES	= 12,	/* access policy */
+	SMK_MAPPED	= 13,	/* CIPSO level indicating mapped label */
+	SMK_LOAD2	= 14,	/* load policy with long labels */
+	SMK_LOAD_SELF2	= 15,	/* load task specific rules with long labels */
+	SMK_ACCESS2	= 16,	/* make an access check with long labels */
+	SMK_CIPSO2	= 17,	/* load long label -> CIPSO mapping */
 };
 
 /*
@@ -60,7 +64,7 @@ static DEFINE_MUTEX(smk_netlbladdr_lock);
  * If it isn't somehow marked, use this.
  * It can be reset via smackfs/ambient
  */
-char *smack_net_ambient = smack_known_floor.smk_known;
+char *smack_net_ambient;
 
 /*
  * This is the level in a CIPSO header that indicates a
@@ -70,6 +74,13 @@ char *smack_net_ambient = smack_known_floor.smk_known;
 int smack_cipso_direct = SMACK_CIPSO_DIRECT_DEFAULT;
 
 /*
+ * This is the level in a CIPSO header that indicates a
+ * secid is contained directly in the category set.
+ * It can be reset via smackfs/mapped
+ */
+int smack_cipso_mapped = SMACK_CIPSO_MAPPED_DEFAULT;
+
+/*
  * Unless a process is running with this label even
  * having CAP_MAC_OVERRIDE isn't enough to grant
  * privilege to violate MAC policy. If no label is
@@ -89,7 +100,7 @@ LIST_HEAD(smk_netlbladdr_list);
 
 /*
  * Rule lists are maintained for each label.
- * This master list is just for reading /smack/load.
+ * This master list is just for reading /smack/load and /smack/load2.
  */
 struct smack_master_list {
 	struct list_head	list;
@@ -125,6 +136,18 @@ const char *smack_cipso_option = SMACK_CIPSO_OPTION;
 #define SMK_OLOADLEN	(SMK_LABELLEN + SMK_LABELLEN + SMK_OACCESSLEN)
 #define SMK_LOADLEN	(SMK_LABELLEN + SMK_LABELLEN + SMK_ACCESSLEN)
 
+/*
+ * Stricly for CIPSO level manipulation.
+ * Set the category bit number in a smack label sized buffer.
+ */
+static inline void smack_catset_bit(unsigned int cat, char *catsetp)
+{
+	if (cat == 0 || cat > (SMK_CIPSOLEN * 8))
+		return;
+
+	catsetp[(cat - 1) / 8] |= 0x80 >> ((cat - 1) % 8);
+}
+
 /**
  * smk_netlabel_audit_set - fill a netlbl_audit struct
  * @nap: structure to fill
@@ -137,12 +160,10 @@ static void smk_netlabel_audit_set(struct netlbl_audit *nap)
 }
 
 /*
- * Values for parsing single label host rules
+ * Value for parsing single label host rules
  * "1.2.3.4 X"
- * "192.168.138.129/32 abcdefghijklmnopqrstuvw"
  */
 #define SMK_NETLBLADDRMIN	9
-#define SMK_NETLBLADDRMAX	42
 
 /**
  * smk_set_access - add a rule to the rule list
@@ -188,33 +209,47 @@ static int smk_set_access(struct smack_rule *srp, struct list_head *rule_list,
 }
 
 /**
- * smk_parse_rule - parse Smack rule from load string
- * @data: string to be parsed whose size is SMK_LOADLEN
+ * smk_fill_rule - Fill Smack rule from strings
+ * @subject: subject label string
+ * @object: object label string
+ * @access: access string
  * @rule: Smack rule
  * @import: if non-zero, import labels
+ *
+ * Returns 0 on success, -1 on failure
  */
-static int smk_parse_rule(const char *data, struct smack_rule *rule, int import)
+static int smk_fill_rule(const char *subject, const char *object,
+				const char *access, struct smack_rule *rule,
+				int import)
 {
-	char smack[SMK_LABELLEN];
+	int rc = -1;
+	int done;
+	const char *cp;
 	struct smack_known *skp;
 
 	if (import) {
-		rule->smk_subject = smk_import(data, 0);
+		rule->smk_subject = smk_import(subject, 0);
 		if (rule->smk_subject == NULL)
 			return -1;
 
-		rule->smk_object = smk_import(data + SMK_LABELLEN, 0);
+		rule->smk_object = smk_import(object, 0);
 		if (rule->smk_object == NULL)
 			return -1;
 	} else {
-		smk_parse_smack(data, 0, smack);
-		skp = smk_find_entry(smack);
+		cp = smk_parse_smack(subject, 0);
+		if (cp == NULL)
+			return -1;
+		skp = smk_find_entry(cp);
+		kfree(cp);
 		if (skp == NULL)
 			return -1;
 		rule->smk_subject = skp->smk_known;
 
-		smk_parse_smack(data + SMK_LABELLEN, 0, smack);
-		skp = smk_find_entry(smack);
+		cp = smk_parse_smack(object, 0);
+		if (cp == NULL)
+			return -1;
+		skp = smk_find_entry(cp);
+		kfree(cp);
 		if (skp == NULL)
 			return -1;
 		rule->smk_object = skp->smk_known;
@@ -222,90 +257,127 @@ static int smk_parse_rule(const char *data, struct smack_rule *rule, int import)
 
 	rule->smk_access = 0;
 
-	switch (data[SMK_LABELLEN + SMK_LABELLEN]) {
-	case '-':
-		break;
-	case 'r':
-	case 'R':
-		rule->smk_access |= MAY_READ;
-		break;
-	default:
-		return -1;
+	for (cp = access, done = 0; *cp && !done; cp++) {
+		switch (*cp) {
+		case '-':
+			break;
+		case 'r':
+		case 'R':
+			rule->smk_access |= MAY_READ;
+			break;
+		case 'w':
+		case 'W':
+			rule->smk_access |= MAY_WRITE;
+			break;
+		case 'x':
+		case 'X':
+			rule->smk_access |= MAY_EXEC;
+			break;
+		case 'a':
+		case 'A':
+			rule->smk_access |= MAY_APPEND;
+			break;
+		case 't':
+		case 'T':
+			rule->smk_access |= MAY_TRANSMUTE;
+			break;
+		default:
+			done = 1;
+			break;
+		}
 	}
+	rc = 0;
 
-	switch (data[SMK_LABELLEN + SMK_LABELLEN + 1]) {
-	case '-':
-		break;
-	case 'w':
-	case 'W':
-		rule->smk_access |= MAY_WRITE;
-		break;
-	default:
-		return -1;
-	}
+	return rc;
+}
 
-	switch (data[SMK_LABELLEN + SMK_LABELLEN + 2]) {
-	case '-':
-		break;
-	case 'x':
-	case 'X':
-		rule->smk_access |= MAY_EXEC;
-		break;
-	default:
-		return -1;
-	}
+/**
+ * smk_parse_rule - parse Smack rule from load string
+ * @data: string to be parsed whose size is SMK_LOADLEN
+ * @rule: Smack rule
+ * @import: if non-zero, import labels
+ *
+ * Returns 0 on success, -1 on errors.
+ */
+static int smk_parse_rule(const char *data, struct smack_rule *rule, int import)
+{
+	int rc;
 
-	switch (data[SMK_LABELLEN + SMK_LABELLEN + 3]) {
-	case '-':
-		break;
-	case 'a':
-	case 'A':
-		rule->smk_access |= MAY_APPEND;
-		break;
-	default:
-		return -1;
-	}
+	rc = smk_fill_rule(data, data + SMK_LABELLEN,
+			   data + SMK_LABELLEN + SMK_LABELLEN, rule, import);
+	return rc;
+}
 
-	switch (data[SMK_LABELLEN + SMK_LABELLEN + 4]) {
-	case '-':
-		break;
-	case 't':
-	case 'T':
-		rule->smk_access |= MAY_TRANSMUTE;
-		break;
-	default:
-		return -1;
-	}
+/**
+ * smk_parse_long_rule - parse Smack rule from rule string
+ * @data: string to be parsed, null terminated
+ * @rule: Smack rule
+ * @import: if non-zero, import labels
+ *
+ * Returns 0 on success, -1 on failure
+ */
+static int smk_parse_long_rule(const char *data, struct smack_rule *rule,
+				int import)
+{
+	char *subject;
+	char *object;
+	char *access;
+	int datalen;
+	int rc = -1;
 
-	return 0;
+	/*
+	 * This is probably inefficient, but safe.
+	 */
+	datalen = strlen(data);
+	subject = kzalloc(datalen, GFP_KERNEL);
+	if (subject == NULL)
+		return -1;
+	object = kzalloc(datalen, GFP_KERNEL);
+	if (object == NULL)
+		goto free_out_s;
+	access = kzalloc(datalen, GFP_KERNEL);
+	if (access == NULL)
+		goto free_out_o;
+
+	if (sscanf(data, "%s %s %s", subject, object, access) == 3)
+		rc = smk_fill_rule(subject, object, access, rule, import);
+
+	kfree(access);
+free_out_o:
+	kfree(object);
+free_out_s:
+	kfree(subject);
+	return rc;
 }
 
+#define SMK_FIXED24_FMT	0	/* Fixed 24byte label format */
+#define SMK_LONG_FMT	1	/* Variable long label format */
 /**
- * smk_write_load_list - write() for any /smack/load
+ * smk_write_rules_list - write() for any /smack rule file
  * @file: file pointer, not actually used
  * @buf: where to get the data from
  * @count: bytes sent
  * @ppos: where to start - must be 0
  * @rule_list: the list of rules to write to
  * @rule_lock: lock for the rule list
+ * @format: /smack/load or /smack/load2 format.
  *
  * Get one smack access rule from above.
- * The format is exactly:
- *     char subject[SMK_LABELLEN]
- *     char object[SMK_LABELLEN]
- *     char access[SMK_ACCESSLEN]
- *
- * writes must be SMK_LABELLEN+SMK_LABELLEN+SMK_ACCESSLEN bytes.
+ * The format for SMK_LONG_FMT is:
+ *	"subject<whitespace>object<whitespace>access[<whitespace>...]"
+ * The format for SMK_FIXED24_FMT is exactly:
+ *	"subject                 object                  rwxat"
  */
-static ssize_t smk_write_load_list(struct file *file, const char __user *buf,
-				size_t count, loff_t *ppos,
-				struct list_head *rule_list,
-				struct mutex *rule_lock)
+static ssize_t smk_write_rules_list(struct file *file, const char __user *buf,
+					size_t count, loff_t *ppos,
+					struct list_head *rule_list,
+					struct mutex *rule_lock, int format)
 {
 	struct smack_master_list *smlp;
 	struct smack_known *skp;
 	struct smack_rule *rule;
 	char *data;
+	int datalen;
 	int rc = -EINVAL;
 	int load = 0;
 
@@ -315,13 +387,18 @@ static ssize_t smk_write_load_list(struct file *file, const char __user *buf,
 	 */
 	if (*ppos != 0)
 		return -EINVAL;
-	/*
-	 * Minor hack for backward compatibility
-	 */
-	if (count < (SMK_OLOADLEN) || count > SMK_LOADLEN)
-		return -EINVAL;
 
-	data = kzalloc(SMK_LOADLEN, GFP_KERNEL);
+	if (format == SMK_FIXED24_FMT) {
+		/*
+		 * Minor hack for backward compatibility
+		 */
+		if (count != SMK_OLOADLEN && count != SMK_LOADLEN)
+			return -EINVAL;
+		datalen = SMK_LOADLEN;
+	} else
+		datalen = count + 1;
+
+	data = kzalloc(datalen, GFP_KERNEL);
 	if (data == NULL)
 		return -ENOMEM;
 
@@ -330,20 +407,29 @@ static ssize_t smk_write_load_list(struct file *file, const char __user *buf,
 		goto out;
 	}
 
-	/*
-	 * More on the minor hack for backward compatibility
-	 */
-	if (count == (SMK_OLOADLEN))
-		data[SMK_OLOADLEN] = '-';
-
 	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
 	if (rule == NULL) {
 		rc = -ENOMEM;
 		goto out;
 	}
 
-	if (smk_parse_rule(data, rule, 1))
-		goto out_free_rule;
+	if (format == SMK_LONG_FMT) {
+		/*
+		 * Be sure the data string is terminated.
+		 */
+		data[count] = '\0';
+		if (smk_parse_long_rule(data, rule, 1))
+			goto out_free_rule;
+	} else {
+		/*
+		 * More on the minor hack for backward compatibility
+		 */
+		if (count == (SMK_OLOADLEN))
+			data[SMK_OLOADLEN] = '-';
+		if (smk_parse_rule(data, rule, 1))
+			goto out_free_rule;
+	}
+
 
 	if (rule_list == NULL) {
 		load = 1;
@@ -354,18 +440,20 @@ static ssize_t smk_write_load_list(struct file *file, const char __user *buf,
 
 	rc = count;
 	/*
-	 * If this is "load" as opposed to "load-self" and a new rule
+	 * If this is a global as opposed to self and a new rule
 	 * it needs to get added for reporting.
 	 * smk_set_access returns true if there was already a rule
 	 * for the subject/object pair, and false if it was new.
 	 */
-	if (load && !smk_set_access(rule, rule_list, rule_lock)) {
-		smlp = kzalloc(sizeof(*smlp), GFP_KERNEL);
-		if (smlp != NULL) {
-			smlp->smk_rule = rule;
-			list_add_rcu(&smlp->list, &smack_rule_list);
-		} else
-			rc = -ENOMEM;
+	if (!smk_set_access(rule, rule_list, rule_lock)) {
+		if (load) {
+			smlp = kzalloc(sizeof(*smlp), GFP_KERNEL);
+			if (smlp != NULL) {
+				smlp->smk_rule = rule;
+				list_add_rcu(&smlp->list, &smack_rule_list);
+			} else
+				rc = -ENOMEM;
+		}
 		goto out;
 	}
 
@@ -421,29 +509,18 @@ static void smk_seq_stop(struct seq_file *s, void *v)
 	/* No-op */
 }
 
-/*
- * Seq_file read operations for /smack/load
- */
-
-static void *load_seq_start(struct seq_file *s, loff_t *pos)
-{
-	return smk_seq_start(s, pos, &smack_rule_list);
-}
-
-static void *load_seq_next(struct seq_file *s, void *v, loff_t *pos)
+static void smk_rule_show(struct seq_file *s, struct smack_rule *srp, int max)
 {
-	return smk_seq_next(s, v, pos, &smack_rule_list);
-}
-
-static int load_seq_show(struct seq_file *s, void *v)
-{
-	struct list_head *list = v;
-	struct smack_master_list *smlp =
-		 list_entry(list, struct smack_master_list, list);
-	struct smack_rule *srp = smlp->smk_rule;
+	/*
+	 * Don't show any rules with label names too long for
+	 * interface file (/smack/load or /smack/load2)
+	 * because you should expect to be able to write
+	 * anything you read back.
+	 */
+	if (strlen(srp->smk_subject) >= max || strlen(srp->smk_object) >= max)
+		return;
 
-	seq_printf(s, "%s %s", (char *)srp->smk_subject,
-		   (char *)srp->smk_object);
+	seq_printf(s, "%s %s", srp->smk_subject, srp->smk_object);
 
 	seq_putc(s, ' ');
 
@@ -461,13 +538,36 @@ static int load_seq_show(struct seq_file *s, void *v)
 		seq_putc(s, '-');
 
 	seq_putc(s, '\n');
+}
+
+/*
+ * Seq_file read operations for /smack/load
+ */
+
+static void *load2_seq_start(struct seq_file *s, loff_t *pos)
+{
+	return smk_seq_start(s, pos, &smack_rule_list);
+}
+
+static void *load2_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	return smk_seq_next(s, v, pos, &smack_rule_list);
+}
+
+static int load_seq_show(struct seq_file *s, void *v)
+{
+	struct list_head *list = v;
+	struct smack_master_list *smlp =
+		 list_entry(list, struct smack_master_list, list);
+
+	smk_rule_show(s, smlp->smk_rule, SMK_LABELLEN);
 
 	return 0;
 }
 
 static const struct seq_operations load_seq_ops = {
-	.start = load_seq_start,
-	.next  = load_seq_next,
+	.start = load2_seq_start,
+	.next  = load2_seq_next,
 	.show  = load_seq_show,
 	.stop  = smk_seq_stop,
 };
@@ -504,7 +604,8 @@ static ssize_t smk_write_load(struct file *file, const char __user *buf,
 	if (!capable(CAP_MAC_ADMIN))
 		return -EPERM;
 
-	return smk_write_load_list(file, buf, count, ppos, NULL, NULL);
+	return smk_write_rules_list(file, buf, count, ppos, NULL, NULL,
+				    SMK_FIXED24_FMT);
 }
 
 static const struct file_operations smk_load_ops = {
@@ -574,6 +675,8 @@ static void smk_unlbl_ambient(char *oldambient)
 			printk(KERN_WARNING "%s:%d remove rc = %d\n",
 			       __func__, __LINE__, rc);
 	}
+	if (smack_net_ambient == NULL)
+		smack_net_ambient = smack_known_floor.smk_known;
 
 	rc = netlbl_cfg_unlbl_map_add(smack_net_ambient, PF_INET,
 				      NULL, NULL, &nai);
@@ -605,27 +708,28 @@ static int cipso_seq_show(struct seq_file *s, void *v)
 	struct list_head  *list = v;
 	struct smack_known *skp =
 		 list_entry(list, struct smack_known, list);
-	struct smack_cipso *scp = skp->smk_cipso;
-	char *cbp;
+	struct netlbl_lsm_secattr_catmap *cmp = skp->smk_netlabel.attr.mls.cat;
 	char sep = '/';
-	int cat = 1;
 	int i;
-	unsigned char m;
 
-	if (scp == NULL)
+	/*
+	 * Don't show a label that could not have been set using
+	 * /smack/cipso. This is in support of the notion that
+	 * anything read from /smack/cipso ought to be writeable
+	 * to /smack/cipso.
+	 *
+	 * /smack/cipso2 should be used instead.
+	 */
+	if (strlen(skp->smk_known) >= SMK_LABELLEN)
 		return 0;
 
-	seq_printf(s, "%s %3d", (char *)&skp->smk_known, scp->smk_level);
+	seq_printf(s, "%s %3d", skp->smk_known, skp->smk_netlabel.attr.mls.lvl);
 
-	cbp = scp->smk_catset;
-	for (i = 0; i < SMK_LABELLEN; i++)
-		for (m = 0x80; m != 0; m >>= 1) {
-			if (m & cbp[i]) {
-				seq_printf(s, "%c%d", sep, cat);
-				sep = ',';
-			}
-			cat++;
-		}
+	for (i = netlbl_secattr_catmap_walk(cmp, 0); i >= 0;
+	     i = netlbl_secattr_catmap_walk(cmp, i + 1)) {
+		seq_printf(s, "%c%d", sep, i);
+		sep = ',';
+	}
 
 	seq_putc(s, '\n');
 
@@ -653,23 +757,24 @@ static int smk_open_cipso(struct inode *inode, struct file *file)
 }
 
 /**
- * smk_write_cipso - write() for /smack/cipso
+ * smk_set_cipso - do the work for write() for cipso and cipso2
  * @file: file pointer, not actually used
  * @buf: where to get the data from
  * @count: bytes sent
  * @ppos: where to start
+ * @format: /smack/cipso or /smack/cipso2
  *
  * Accepts only one cipso rule per write call.
  * Returns number of bytes written or error code, as appropriate
  */
-static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
-			       size_t count, loff_t *ppos)
+static ssize_t smk_set_cipso(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos, int format)
 {
 	struct smack_known *skp;
-	struct smack_cipso *scp = NULL;
-	char mapcatset[SMK_LABELLEN];
+	struct netlbl_lsm_secattr ncats;
+	char mapcatset[SMK_CIPSOLEN];
 	int maplevel;
-	int cat;
+	unsigned int cat;
 	int catlen;
 	ssize_t rc = -EINVAL;
 	char *data = NULL;
@@ -686,7 +791,8 @@ static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
 		return -EPERM;
 	if (*ppos != 0)
 		return -EINVAL;
-	if (count < SMK_CIPSOMIN || count > SMK_CIPSOMAX)
+	if (format == SMK_FIXED24_FMT &&
+	    (count < SMK_CIPSOMIN || count > SMK_CIPSOMAX))
 		return -EINVAL;
 
 	data = kzalloc(count + 1, GFP_KERNEL);
@@ -698,11 +804,6 @@ static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
 		goto unlockedout;
 	}
 
-	/* labels cannot begin with a '-' */
-	if (data[0] == '-') {
-		rc = -EINVAL;
-		goto unlockedout;
-	}
 	data[count] = '\0';
 	rule = data;
 	/*
@@ -715,7 +816,11 @@ static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
 	if (skp == NULL)
 		goto out;
 
-	rule += SMK_LABELLEN;
+	if (format == SMK_FIXED24_FMT)
+		rule += SMK_LABELLEN;
+	else
+		rule += strlen(skp->smk_known);
+
 	ret = sscanf(rule, "%d", &maplevel);
 	if (ret != 1 || maplevel > SMACK_CIPSO_MAXLEVEL)
 		goto out;
@@ -725,41 +830,29 @@ static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
 	if (ret != 1 || catlen > SMACK_CIPSO_MAXCATNUM)
 		goto out;
 
-	if (count != (SMK_CIPSOMIN + catlen * SMK_DIGITLEN))
+	if (format == SMK_FIXED24_FMT &&
+	    count != (SMK_CIPSOMIN + catlen * SMK_DIGITLEN))
 		goto out;
 
 	memset(mapcatset, 0, sizeof(mapcatset));
 
 	for (i = 0; i < catlen; i++) {
 		rule += SMK_DIGITLEN;
-		ret = sscanf(rule, "%d", &cat);
+		ret = sscanf(rule, "%u", &cat);
 		if (ret != 1 || cat > SMACK_CIPSO_MAXCATVAL)
 			goto out;
 
 		smack_catset_bit(cat, mapcatset);
 	}
 
-	if (skp->smk_cipso == NULL) {
-		scp = kzalloc(sizeof(struct smack_cipso), GFP_KERNEL);
-		if (scp == NULL) {
-			rc = -ENOMEM;
-			goto out;
-		}
+	rc = smk_netlbl_mls(maplevel, mapcatset, &ncats, SMK_CIPSOLEN);
+	if (rc >= 0) {
+		netlbl_secattr_catmap_free(skp->smk_netlabel.attr.mls.cat);
+		skp->smk_netlabel.attr.mls.cat = ncats.attr.mls.cat;
+		skp->smk_netlabel.attr.mls.lvl = ncats.attr.mls.lvl;
+		rc = count;
 	}
 
-	spin_lock_bh(&skp->smk_cipsolock);
-
-	if (scp == NULL)
-		scp = skp->smk_cipso;
-	else
-		skp->smk_cipso = scp;
-
-	scp->smk_level = maplevel;
-	memcpy(scp->smk_catset, mapcatset, sizeof(mapcatset));
-
-	spin_unlock_bh(&skp->smk_cipsolock);
-
-	rc = count;
 out:
 	mutex_unlock(&smack_cipso_lock);
 unlockedout:
@@ -767,6 +860,22 @@ unlockedout:
 	return rc;
 }
 
+/**
+ * smk_write_cipso - write() for /smack/cipso
+ * @file: file pointer, not actually used
+ * @buf: where to get the data from
+ * @count: bytes sent
+ * @ppos: where to start
+ *
+ * Accepts only one cipso rule per write call.
+ * Returns number of bytes written or error code, as appropriate
+ */
+static ssize_t smk_write_cipso(struct file *file, const char __user *buf,
+			       size_t count, loff_t *ppos)
+{
+	return smk_set_cipso(file, buf, count, ppos, SMK_FIXED24_FMT);
+}
+
 static const struct file_operations smk_cipso_ops = {
 	.open           = smk_open_cipso,
 	.read		= seq_read,
@@ -776,6 +885,80 @@ static const struct file_operations smk_cipso_ops = {
 };
 
 /*
+ * Seq_file read operations for /smack/cipso2
+ */
+
+/*
+ * Print cipso labels in format:
+ * label level[/cat[,cat]]
+ */
+static int cipso2_seq_show(struct seq_file *s, void *v)
+{
+	struct list_head  *list = v;
+	struct smack_known *skp =
+		 list_entry(list, struct smack_known, list);
+	struct netlbl_lsm_secattr_catmap *cmp = skp->smk_netlabel.attr.mls.cat;
+	char sep = '/';
+	int i;
+
+	seq_printf(s, "%s %3d", skp->smk_known, skp->smk_netlabel.attr.mls.lvl);
+
+	for (i = netlbl_secattr_catmap_walk(cmp, 0); i >= 0;
+	     i = netlbl_secattr_catmap_walk(cmp, i + 1)) {
+		seq_printf(s, "%c%d", sep, i);
+		sep = ',';
+	}
+
+	seq_putc(s, '\n');
+
+	return 0;
+}
+
+static const struct seq_operations cipso2_seq_ops = {
+	.start = cipso_seq_start,
+	.next  = cipso_seq_next,
+	.show  = cipso2_seq_show,
+	.stop  = smk_seq_stop,
+};
+
+/**
+ * smk_open_cipso2 - open() for /smack/cipso2
+ * @inode: inode structure representing file
+ * @file: "cipso2" file pointer
+ *
+ * Connect our cipso_seq_* operations with /smack/cipso2
+ * file_operations
+ */
+static int smk_open_cipso2(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &cipso2_seq_ops);
+}
+
+/**
+ * smk_write_cipso2 - write() for /smack/cipso2
+ * @file: file pointer, not actually used
+ * @buf: where to get the data from
+ * @count: bytes sent
+ * @ppos: where to start
+ *
+ * Accepts only one cipso rule per write call.
+ * Returns number of bytes written or error code, as appropriate
+ */
+static ssize_t smk_write_cipso2(struct file *file, const char __user *buf,
+			      size_t count, loff_t *ppos)
+{
+	return smk_set_cipso(file, buf, count, ppos, SMK_LONG_FMT);
+}
+
+static const struct file_operations smk_cipso2_ops = {
+	.open           = smk_open_cipso2,
+	.read		= seq_read,
+	.llseek         = seq_lseek,
+	.write		= smk_write_cipso2,
+	.release        = seq_release,
+};
+
+/*
  * Seq_file read operations for /smack/netlabel
  */
 
@@ -887,9 +1070,9 @@ static ssize_t smk_write_netlbladdr(struct file *file, const char __user *buf,
 {
 	struct smk_netlbladdr *skp;
 	struct sockaddr_in newname;
-	char smack[SMK_LABELLEN];
+	char *smack;
 	char *sp;
-	char data[SMK_NETLBLADDRMAX + 1];
+	char *data;
 	char *host = (char *)&newname.sin_addr.s_addr;
 	int rc;
 	struct netlbl_audit audit_info;
@@ -911,10 +1094,23 @@ static ssize_t smk_write_netlbladdr(struct file *file, const char __user *buf,
 		return -EPERM;
 	if (*ppos != 0)
 		return -EINVAL;
-	if (count < SMK_NETLBLADDRMIN || count > SMK_NETLBLADDRMAX)
+	if (count < SMK_NETLBLADDRMIN)
 		return -EINVAL;
-	if (copy_from_user(data, buf, count) != 0)
-		return -EFAULT;
+
+	data = kzalloc(count + 1, GFP_KERNEL);
+	if (data == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user(data, buf, count) != 0) {
+		rc = -EFAULT;
+		goto free_data_out;
+	}
+
+	smack = kzalloc(count + 1, GFP_KERNEL);
+	if (smack == NULL) {
+		rc = -ENOMEM;
+		goto free_data_out;
+	}
 
 	data[count] = '\0';
 
@@ -923,24 +1119,34 @@ static ssize_t smk_write_netlbladdr(struct file *file, const char __user *buf,
 	if (rc != 6) {
 		rc = sscanf(data, "%hhd.%hhd.%hhd.%hhd %s",
 			&host[0], &host[1], &host[2], &host[3], smack);
-		if (rc != 5)
-			return -EINVAL;
+		if (rc != 5) {
+			rc = -EINVAL;
+			goto free_out;
+		}
 		m = BEBITS;
 	}
-	if (m > BEBITS)
-		return -EINVAL;
+	if (m > BEBITS) {
+		rc = -EINVAL;
+		goto free_out;
+	}
 
-	/* if smack begins with '-', its an option, don't import it */
+	/*
+	 * If smack begins with '-', it is an option, don't import it
+	 */
 	if (smack[0] != '-') {
 		sp = smk_import(smack, 0);
-		if (sp == NULL)
-			return -EINVAL;
+		if (sp == NULL) {
+			rc = -EINVAL;
+			goto free_out;
+		}
 	} else {
 		/* check known options */
 		if (strcmp(smack, smack_cipso_option) == 0)
 			sp = (char *)smack_cipso_option;
-		else
-			return -EINVAL;
+		else {
+			rc = -EINVAL;
+			goto free_out;
+		}
 	}
 
 	for (temp_mask = 0; m > 0; m--) {
@@ -1006,6 +1212,11 @@ static ssize_t smk_write_netlbladdr(struct file *file, const char __user *buf,
 
 	mutex_unlock(&smk_netlbladdr_lock);
 
+free_out:
+	kfree(smack);
+free_data_out:
+	kfree(data);
+
 	return rc;
 }
 
@@ -1119,6 +1330,7 @@ static ssize_t smk_read_direct(struct file *filp, char __user *buf,
 static ssize_t smk_write_direct(struct file *file, const char __user *buf,
 				size_t count, loff_t *ppos)
 {
+	struct smack_known *skp;
 	char temp[80];
 	int i;
 
@@ -1136,7 +1348,20 @@ static ssize_t smk_write_direct(struct file *file, const char __user *buf,
 	if (sscanf(temp, "%d", &i) != 1)
 		return -EINVAL;
 
-	smack_cipso_direct = i;
+	/*
+	 * Don't do anything if the value hasn't actually changed.
+	 * If it is changing reset the level on entries that were
+	 * set up to be direct when they were created.
+	 */
+	if (smack_cipso_direct != i) {
+		mutex_lock(&smack_known_lock);
+		list_for_each_entry_rcu(skp, &smack_known_list, list)
+			if (skp->smk_netlabel.attr.mls.lvl ==
+			    smack_cipso_direct)
+				skp->smk_netlabel.attr.mls.lvl = i;
+		smack_cipso_direct = i;
+		mutex_unlock(&smack_known_lock);
+	}
 
 	return count;
 }
@@ -1148,6 +1373,84 @@ static const struct file_operations smk_direct_ops = {
 };
 
 /**
+ * smk_read_mapped - read() for /smack/mapped
+ * @filp: file pointer, not actually used
+ * @buf: where to put the result
+ * @count: maximum to send along
+ * @ppos: where to start
+ *
+ * Returns number of bytes read or error code, as appropriate
+ */
+static ssize_t smk_read_mapped(struct file *filp, char __user *buf,
+			       size_t count, loff_t *ppos)
+{
+	char temp[80];
+	ssize_t rc;
+
+	if (*ppos != 0)
+		return 0;
+
+	sprintf(temp, "%d", smack_cipso_mapped);
+	rc = simple_read_from_buffer(buf, count, ppos, temp, strlen(temp));
+
+	return rc;
+}
+
+/**
+ * smk_write_mapped - write() for /smack/mapped
+ * @file: file pointer, not actually used
+ * @buf: where to get the data from
+ * @count: bytes sent
+ * @ppos: where to start
+ *
+ * Returns number of bytes written or error code, as appropriate
+ */
+static ssize_t smk_write_mapped(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	struct smack_known *skp;
+	char temp[80];
+	int i;
+
+	if (!capable(CAP_MAC_ADMIN))
+		return -EPERM;
+
+	if (count >= sizeof(temp) || count == 0)
+		return -EINVAL;
+
+	if (copy_from_user(temp, buf, count) != 0)
+		return -EFAULT;
+
+	temp[count] = '\0';
+
+	if (sscanf(temp, "%d", &i) != 1)
+		return -EINVAL;
+
+	/*
+	 * Don't do anything if the value hasn't actually changed.
+	 * If it is changing reset the level on entries that were
+	 * set up to be mapped when they were created.
+	 */
+	if (smack_cipso_mapped != i) {
+		mutex_lock(&smack_known_lock);
+		list_for_each_entry_rcu(skp, &smack_known_list, list)
+			if (skp->smk_netlabel.attr.mls.lvl ==
+			    smack_cipso_mapped)
+				skp->smk_netlabel.attr.mls.lvl = i;
+		smack_cipso_mapped = i;
+		mutex_unlock(&smack_known_lock);
+	}
+
+	return count;
+}
+
+static const struct file_operations smk_mapped_ops = {
+	.read		= smk_read_mapped,
+	.write		= smk_write_mapped,
+	.llseek		= default_llseek,
+};
+
+/**
  * smk_read_ambient - read() for /smack/ambient
  * @filp: file pointer, not actually used
  * @buf: where to put the result
@@ -1195,22 +1498,28 @@ static ssize_t smk_read_ambient(struct file *filp, char __user *buf,
 static ssize_t smk_write_ambient(struct file *file, const char __user *buf,
 				 size_t count, loff_t *ppos)
 {
-	char in[SMK_LABELLEN];
 	char *oldambient;
-	char *smack;
+	char *smack = NULL;
+	char *data;
+	int rc = count;
 
 	if (!capable(CAP_MAC_ADMIN))
 		return -EPERM;
 
-	if (count >= SMK_LABELLEN)
-		return -EINVAL;
+	data = kzalloc(count + 1, GFP_KERNEL);
+	if (data == NULL)
+		return -ENOMEM;
 
-	if (copy_from_user(in, buf, count) != 0)
-		return -EFAULT;
+	if (copy_from_user(data, buf, count) != 0) {
+		rc = -EFAULT;
+		goto out;
+	}
 
-	smack = smk_import(in, count);
-	if (smack == NULL)
-		return -EINVAL;
+	smack = smk_import(data, count);
+	if (smack == NULL) {
+		rc = -EINVAL;
+		goto out;
+	}
 
 	mutex_lock(&smack_ambient_lock);
 
@@ -1220,7 +1529,9 @@ static ssize_t smk_write_ambient(struct file *file, const char __user *buf,
 
 	mutex_unlock(&smack_ambient_lock);
 
-	return count;
+out:
+	kfree(data);
+	return rc;
 }
 
 static const struct file_operations smk_ambient_ops = {
@@ -1271,8 +1582,9 @@ static ssize_t smk_read_onlycap(struct file *filp, char __user *buf,
 static ssize_t smk_write_onlycap(struct file *file, const char __user *buf,
 				 size_t count, loff_t *ppos)
 {
-	char in[SMK_LABELLEN];
+	char *data;
 	char *sp = smk_of_task(current->cred->security);
+	int rc = count;
 
 	if (!capable(CAP_MAC_ADMIN))
 		return -EPERM;
@@ -1285,11 +1597,9 @@ static ssize_t smk_write_onlycap(struct file *file, const char __user *buf,
 	if (smack_onlycap != NULL && smack_onlycap != sp)
 		return -EPERM;
 
-	if (count >= SMK_LABELLEN)
-		return -EINVAL;
-
-	if (copy_from_user(in, buf, count) != 0)
-		return -EFAULT;
+	data = kzalloc(count, GFP_KERNEL);
+	if (data == NULL)
+		return -ENOMEM;
 
 	/*
 	 * Should the null string be passed in unset the onlycap value.
@@ -1297,10 +1607,17 @@ static ssize_t smk_write_onlycap(struct file *file, const char __user *buf,
 	 * smk_import only expects to return NULL for errors. It
 	 * is usually the case that a nullstring or "\n" would be
 	 * bad to pass to smk_import but in fact this is useful here.
+	 *
+	 * smk_import will also reject a label beginning with '-',
+	 * so "-usecapabilities" will also work.
 	 */
-	smack_onlycap = smk_import(in, count);
+	if (copy_from_user(data, buf, count) != 0)
+		rc = -EFAULT;
+	else
+		smack_onlycap = smk_import(data, count);
 
-	return count;
+	kfree(data);
+	return rc;
 }
 
 static const struct file_operations smk_onlycap_ops = {
@@ -1398,25 +1715,7 @@ static int load_self_seq_show(struct seq_file *s, void *v)
 	struct smack_rule *srp =
 		 list_entry(list, struct smack_rule, list);
 
-	seq_printf(s, "%s %s", (char *)srp->smk_subject,
-		   (char *)srp->smk_object);
-
-	seq_putc(s, ' ');
-
-	if (srp->smk_access & MAY_READ)
-		seq_putc(s, 'r');
-	if (srp->smk_access & MAY_WRITE)
-		seq_putc(s, 'w');
-	if (srp->smk_access & MAY_EXEC)
-		seq_putc(s, 'x');
-	if (srp->smk_access & MAY_APPEND)
-		seq_putc(s, 'a');
-	if (srp->smk_access & MAY_TRANSMUTE)
-		seq_putc(s, 't');
-	if (srp->smk_access == 0)
-		seq_putc(s, '-');
-
-	seq_putc(s, '\n');
+	smk_rule_show(s, srp, SMK_LABELLEN);
 
 	return 0;
 }
@@ -1430,7 +1729,7 @@ static const struct seq_operations load_self_seq_ops = {
 
 
 /**
- * smk_open_load_self - open() for /smack/load-self
+ * smk_open_load_self - open() for /smack/load-self2
  * @inode: inode structure representing file
  * @file: "load" file pointer
  *
@@ -1454,8 +1753,8 @@ static ssize_t smk_write_load_self(struct file *file, const char __user *buf,
 {
 	struct task_smack *tsp = current_security();
 
-	return smk_write_load_list(file, buf, count, ppos, &tsp->smk_rules,
-					&tsp->smk_rules_lock);
+	return smk_write_rules_list(file, buf, count, ppos, &tsp->smk_rules,
+				    &tsp->smk_rules_lock, SMK_FIXED24_FMT);
 }
 
 static const struct file_operations smk_load_self_ops = {
@@ -1467,24 +1766,42 @@ static const struct file_operations smk_load_self_ops = {
 };
 
 /**
- * smk_write_access - handle access check transaction
+ * smk_user_access - handle access check transaction
  * @file: file pointer
  * @buf: data from user space
  * @count: bytes sent
  * @ppos: where to start - must be 0
  */
-static ssize_t smk_write_access(struct file *file, const char __user *buf,
-				size_t count, loff_t *ppos)
+static ssize_t smk_user_access(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos, int format)
 {
 	struct smack_rule rule;
 	char *data;
+	char *cod;
 	int res;
 
 	data = simple_transaction_get(file, buf, count);
 	if (IS_ERR(data))
 		return PTR_ERR(data);
 
-	if (count < SMK_LOADLEN || smk_parse_rule(data, &rule, 0))
+	if (format == SMK_FIXED24_FMT) {
+		if (count < SMK_LOADLEN)
+			return -EINVAL;
+		res = smk_parse_rule(data, &rule, 0);
+	} else {
+		/*
+		 * Copy the data to make sure the string is terminated.
+		 */
+		cod = kzalloc(count + 1, GFP_KERNEL);
+		if (cod == NULL)
+			return -ENOMEM;
+		memcpy(cod, data, count);
+		cod[count] = '\0';
+		res = smk_parse_long_rule(cod, &rule, 0);
+		kfree(cod);
+	}
+
+	if (res)
 		return -EINVAL;
 
 	res = smk_access(rule.smk_subject, rule.smk_object, rule.smk_access,
@@ -1493,7 +1810,23 @@ static ssize_t smk_write_access(struct file *file, const char __user *buf,
 	data[1] = '\0';
 
 	simple_transaction_set(file, 2);
-	return SMK_LOADLEN;
+
+	if (format == SMK_FIXED24_FMT)
+		return SMK_LOADLEN;
+	return count;
+}
+
+/**
+ * smk_write_access - handle access check transaction
+ * @file: file pointer
+ * @buf: data from user space
+ * @count: bytes sent
+ * @ppos: where to start - must be 0
+ */
+static ssize_t smk_write_access(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	return smk_user_access(file, buf, count, ppos, SMK_FIXED24_FMT);
 }
 
 static const struct file_operations smk_access_ops = {
@@ -1503,6 +1836,163 @@ static const struct file_operations smk_access_ops = {
 	.llseek		= generic_file_llseek,
 };
 
+
+/*
+ * Seq_file read operations for /smack/load2
+ */
+
+static int load2_seq_show(struct seq_file *s, void *v)
+{
+	struct list_head *list = v;
+	struct smack_master_list *smlp =
+		 list_entry(list, struct smack_master_list, list);
+
+	smk_rule_show(s, smlp->smk_rule, SMK_LONGLABEL);
+
+	return 0;
+}
+
+static const struct seq_operations load2_seq_ops = {
+	.start = load2_seq_start,
+	.next  = load2_seq_next,
+	.show  = load2_seq_show,
+	.stop  = smk_seq_stop,
+};
+
+/**
+ * smk_open_load2 - open() for /smack/load2
+ * @inode: inode structure representing file
+ * @file: "load2" file pointer
+ *
+ * For reading, use load2_seq_* seq_file reading operations.
+ */
+static int smk_open_load2(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &load2_seq_ops);
+}
+
+/**
+ * smk_write_load2 - write() for /smack/load2
+ * @file: file pointer, not actually used
+ * @buf: where to get the data from
+ * @count: bytes sent
+ * @ppos: where to start - must be 0
+ *
+ */
+static ssize_t smk_write_load2(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	/*
+	 * Must have privilege.
+	 */
+	if (!capable(CAP_MAC_ADMIN))
+		return -EPERM;
+
+	return smk_write_rules_list(file, buf, count, ppos, NULL, NULL,
+				    SMK_LONG_FMT);
+}
+
+static const struct file_operations smk_load2_ops = {
+	.open           = smk_open_load2,
+	.read		= seq_read,
+	.llseek         = seq_lseek,
+	.write		= smk_write_load2,
+	.release        = seq_release,
+};
+
+/*
+ * Seq_file read operations for /smack/load-self2
+ */
+
+static void *load_self2_seq_start(struct seq_file *s, loff_t *pos)
+{
+	struct task_smack *tsp = current_security();
+
+	return smk_seq_start(s, pos, &tsp->smk_rules);
+}
+
+static void *load_self2_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct task_smack *tsp = current_security();
+
+	return smk_seq_next(s, v, pos, &tsp->smk_rules);
+}
+
+static int load_self2_seq_show(struct seq_file *s, void *v)
+{
+	struct list_head *list = v;
+	struct smack_rule *srp =
+		 list_entry(list, struct smack_rule, list);
+
+	smk_rule_show(s, srp, SMK_LONGLABEL);
+
+	return 0;
+}
+
+static const struct seq_operations load_self2_seq_ops = {
+	.start = load_self2_seq_start,
+	.next  = load_self2_seq_next,
+	.show  = load_self2_seq_show,
+	.stop  = smk_seq_stop,
+};
+
+/**
+ * smk_open_load_self2 - open() for /smack/load-self2
+ * @inode: inode structure representing file
+ * @file: "load" file pointer
+ *
+ * For reading, use load_seq_* seq_file reading operations.
+ */
+static int smk_open_load_self2(struct inode *inode, struct file *file)
+{
+	return seq_open(file, &load_self2_seq_ops);
+}
+
+/**
+ * smk_write_load_self2 - write() for /smack/load-self2
+ * @file: file pointer, not actually used
+ * @buf: where to get the data from
+ * @count: bytes sent
+ * @ppos: where to start - must be 0
+ *
+ */
+static ssize_t smk_write_load_self2(struct file *file, const char __user *buf,
+			      size_t count, loff_t *ppos)
+{
+	struct task_smack *tsp = current_security();
+
+	return smk_write_rules_list(file, buf, count, ppos, &tsp->smk_rules,
+				    &tsp->smk_rules_lock, SMK_LONG_FMT);
+}
+
+static const struct file_operations smk_load_self2_ops = {
+	.open           = smk_open_load_self2,
+	.read		= seq_read,
+	.llseek         = seq_lseek,
+	.write		= smk_write_load_self2,
+	.release        = seq_release,
+};
+
+/**
+ * smk_write_access2 - handle access check transaction
+ * @file: file pointer
+ * @buf: data from user space
+ * @count: bytes sent
+ * @ppos: where to start - must be 0
+ */
+static ssize_t smk_write_access2(struct file *file, const char __user *buf,
+					size_t count, loff_t *ppos)
+{
+	return smk_user_access(file, buf, count, ppos, SMK_LONG_FMT);
+}
+
+static const struct file_operations smk_access2_ops = {
+	.write		= smk_write_access2,
+	.read		= simple_transaction_read,
+	.release	= simple_transaction_release,
+	.llseek		= generic_file_llseek,
+};
+
 /**
  * smk_fill_super - fill the /smackfs superblock
  * @sb: the empty superblock
@@ -1539,6 +2029,16 @@ static int smk_fill_super(struct super_block *sb, void *data, int silent)
 			"load-self", &smk_load_self_ops, S_IRUGO|S_IWUGO},
 		[SMK_ACCESSES] = {
 			"access", &smk_access_ops, S_IRUGO|S_IWUGO},
+		[SMK_MAPPED] = {
+			"mapped", &smk_mapped_ops, S_IRUGO|S_IWUSR},
+		[SMK_LOAD2] = {
+			"load2", &smk_load2_ops, S_IRUGO|S_IWUSR},
+		[SMK_LOAD_SELF2] = {
+			"load-self2", &smk_load_self2_ops, S_IRUGO|S_IWUGO},
+		[SMK_ACCESS2] = {
+			"access2", &smk_access2_ops, S_IRUGO|S_IWUGO},
+		[SMK_CIPSO2] = {
+			"cipso2", &smk_cipso2_ops, S_IRUGO|S_IWUSR},
 		/* last one */
 			{""}
 	};
@@ -1581,6 +2081,15 @@ static struct file_system_type smk_fs_type = {
 
 static struct vfsmount *smackfs_mount;
 
+static int __init smk_preset_netlabel(struct smack_known *skp)
+{
+	skp->smk_netlabel.domain = skp->smk_known;
+	skp->smk_netlabel.flags =
+		NETLBL_SECATTR_DOMAIN | NETLBL_SECATTR_MLS_LVL;
+	return smk_netlbl_mls(smack_cipso_direct, skp->smk_known,
+				&skp->smk_netlabel, strlen(skp->smk_known));
+}
+
 /**
  * init_smk_fs - get the smackfs superblock
  *
@@ -1597,6 +2106,7 @@ static struct vfsmount *smackfs_mount;
 static int __init init_smk_fs(void)
 {
 	int err;
+	int rc;
 
 	if (!security_module_enable(&smack_ops))
 		return 0;
@@ -1614,6 +2124,25 @@ static int __init init_smk_fs(void)
 	smk_cipso_doi();
 	smk_unlbl_ambient(NULL);
 
+	rc = smk_preset_netlabel(&smack_known_floor);
+	if (err == 0 && rc < 0)
+		err = rc;
+	rc = smk_preset_netlabel(&smack_known_hat);
+	if (err == 0 && rc < 0)
+		err = rc;
+	rc = smk_preset_netlabel(&smack_known_huh);
+	if (err == 0 && rc < 0)
+		err = rc;
+	rc = smk_preset_netlabel(&smack_known_invalid);
+	if (err == 0 && rc < 0)
+		err = rc;
+	rc = smk_preset_netlabel(&smack_known_star);
+	if (err == 0 && rc < 0)
+		err = rc;
+	rc = smk_preset_netlabel(&smack_known_web);
+	if (err == 0 && rc < 0)
+		err = rc;
+
 	return err;
 }
 
diff --git a/security/tomoyo/common.c b/security/tomoyo/common.c
index 8656b16eef7b..2e0f12c62938 100644
--- a/security/tomoyo/common.c
+++ b/security/tomoyo/common.c
@@ -850,14 +850,9 @@ static int tomoyo_update_manager_entry(const char *manager,
 		policy_list[TOMOYO_ID_MANAGER],
 	};
 	int error = is_delete ? -ENOENT : -ENOMEM;
-	if (tomoyo_domain_def(manager)) {
-		if (!tomoyo_correct_domain(manager))
-			return -EINVAL;
-		e.is_domain = true;
-	} else {
-		if (!tomoyo_correct_path(manager))
-			return -EINVAL;
-	}
+	if (!tomoyo_correct_domain(manager) &&
+	    !tomoyo_correct_word(manager))
+		return -EINVAL;
 	e.manager = tomoyo_get_name(manager);
 	if (e.manager) {
 		error = tomoyo_update_policy(&e.head, sizeof(e), &param,
@@ -932,23 +927,14 @@ static bool tomoyo_manager(void)
 		return true;
 	if (!tomoyo_manage_by_non_root && (task->cred->uid || task->cred->euid))
 		return false;
-	list_for_each_entry_rcu(ptr, &tomoyo_kernel_namespace.
-				policy_list[TOMOYO_ID_MANAGER], head.list) {
-		if (!ptr->head.is_deleted && ptr->is_domain
-		    && !tomoyo_pathcmp(domainname, ptr->manager)) {
-			found = true;
-			break;
-		}
-	}
-	if (found)
-		return true;
 	exe = tomoyo_get_exe();
 	if (!exe)
 		return false;
 	list_for_each_entry_rcu(ptr, &tomoyo_kernel_namespace.
 				policy_list[TOMOYO_ID_MANAGER], head.list) {
-		if (!ptr->head.is_deleted && !ptr->is_domain
-		    && !strcmp(exe, ptr->manager->name)) {
+		if (!ptr->head.is_deleted &&
+		    (!tomoyo_pathcmp(domainname, ptr->manager) ||
+		     !strcmp(exe, ptr->manager->name))) {
 			found = true;
 			break;
 		}
diff --git a/security/tomoyo/common.h b/security/tomoyo/common.h
index 30fd98369700..75e4dc1c02a0 100644
--- a/security/tomoyo/common.h
+++ b/security/tomoyo/common.h
@@ -860,7 +860,6 @@ struct tomoyo_aggregator {
 /* Structure for policy manager. */
 struct tomoyo_manager {
 	struct tomoyo_acl_head head;
-	bool is_domain;  /* True if manager is a domainname. */
 	/* A path to program or a domainname. */
 	const struct tomoyo_path_info *manager;
 };
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index 620d37c159a3..c2d04a50f76a 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -319,14 +319,14 @@ static int tomoyo_file_fcntl(struct file *file, unsigned int cmd,
 }
 
 /**
- * tomoyo_dentry_open - Target for security_dentry_open().
+ * tomoyo_file_open - Target for security_file_open().
  *
  * @f:    Pointer to "struct file".
  * @cred: Pointer to "struct cred".
  *
  * Returns 0 on success, negative value otherwise.
  */
-static int tomoyo_dentry_open(struct file *f, const struct cred *cred)
+static int tomoyo_file_open(struct file *f, const struct cred *cred)
 {
 	int flags = f->f_flags;
 	/* Don't check read permission here if called from do_execve(). */
@@ -510,7 +510,7 @@ static struct security_operations tomoyo_security_ops = {
 	.bprm_set_creds      = tomoyo_bprm_set_creds,
 	.bprm_check_security = tomoyo_bprm_check_security,
 	.file_fcntl          = tomoyo_file_fcntl,
-	.dentry_open         = tomoyo_dentry_open,
+	.file_open           = tomoyo_file_open,
 	.path_truncate       = tomoyo_path_truncate,
 	.path_unlink         = tomoyo_path_unlink,
 	.path_mkdir          = tomoyo_path_mkdir,
diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c
index 573723843a04..83554ee8a587 100644
--- a/security/yama/yama_lsm.c
+++ b/security/yama/yama_lsm.c
@@ -18,7 +18,12 @@
 #include <linux/prctl.h>
 #include <linux/ratelimit.h>
 
-static int ptrace_scope = 1;
+#define YAMA_SCOPE_DISABLED	0
+#define YAMA_SCOPE_RELATIONAL	1
+#define YAMA_SCOPE_CAPABILITY	2
+#define YAMA_SCOPE_NO_ATTACH	3
+
+static int ptrace_scope = YAMA_SCOPE_RELATIONAL;
 
 /* describe a ptrace relationship for potential exception */
 struct ptrace_relation {
@@ -251,17 +256,32 @@ static int yama_ptrace_access_check(struct task_struct *child,
 		return rc;
 
 	/* require ptrace target be a child of ptracer on attach */
-	if (mode == PTRACE_MODE_ATTACH &&
-	    ptrace_scope &&
-	    !task_is_descendant(current, child) &&
-	    !ptracer_exception_found(current, child) &&
-	    !capable(CAP_SYS_PTRACE))
-		rc = -EPERM;
+	if (mode == PTRACE_MODE_ATTACH) {
+		switch (ptrace_scope) {
+		case YAMA_SCOPE_DISABLED:
+			/* No additional restrictions. */
+			break;
+		case YAMA_SCOPE_RELATIONAL:
+			if (!task_is_descendant(current, child) &&
+			    !ptracer_exception_found(current, child) &&
+			    !ns_capable(task_user_ns(child), CAP_SYS_PTRACE))
+				rc = -EPERM;
+			break;
+		case YAMA_SCOPE_CAPABILITY:
+			if (!ns_capable(task_user_ns(child), CAP_SYS_PTRACE))
+				rc = -EPERM;
+			break;
+		case YAMA_SCOPE_NO_ATTACH:
+		default:
+			rc = -EPERM;
+			break;
+		}
+	}
 
 	if (rc) {
 		char name[sizeof(current->comm)];
-		printk_ratelimited(KERN_NOTICE "ptrace of non-child"
-			" pid %d was attempted by: %s (pid %d)\n",
+		printk_ratelimited(KERN_NOTICE
+			"ptrace of pid %d was attempted by: %s (pid %d)\n",
 			child->pid,
 			get_task_comm(name, current),
 			current->pid);
@@ -279,8 +299,27 @@ static struct security_operations yama_ops = {
 };
 
 #ifdef CONFIG_SYSCTL
+static int yama_dointvec_minmax(struct ctl_table *table, int write,
+				void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	int rc;
+
+	if (write && !capable(CAP_SYS_PTRACE))
+		return -EPERM;
+
+	rc = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (rc)
+		return rc;
+
+	/* Lock the max value if it ever gets set. */
+	if (write && *(int *)table->data == *(int *)table->extra2)
+		table->extra1 = table->extra2;
+
+	return rc;
+}
+
 static int zero;
-static int one = 1;
+static int max_scope = YAMA_SCOPE_NO_ATTACH;
 
 struct ctl_path yama_sysctl_path[] = {
 	{ .procname = "kernel", },
@@ -294,9 +333,9 @@ static struct ctl_table yama_sysctl_table[] = {
 		.data           = &ptrace_scope,
 		.maxlen         = sizeof(int),
 		.mode           = 0644,
-		.proc_handler   = proc_dointvec_minmax,
+		.proc_handler   = yama_dointvec_minmax,
 		.extra1         = &zero,
-		.extra2         = &one,
+		.extra2         = &max_scope,
 	},
 	{ }
 };
author	Linus Torvalds <torvalds@linux-foundation.org>	2012-05-21 20:27:36 -0700
committer	Linus Torvalds <torvalds@linux-foundation.org>	2012-05-21 20:27:36 -0700
commit	cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b (patch)
tree	4322be35db678f6299348a76ad60a2023954af7d
parent	99262a3dafa3290866512ddfb32609198f8973e9 (diff)
parent	ff2bb047c4bce9742e94911eeb44b4d6ff4734ab (diff)
download	lwn-cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b.tar.gz lwn-cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b.zip