lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2024-01-11	Merge tag 'for-6.8/io_uring-2024-01-08' of git://git.kernel.dk/linux	Linus Torvalds
	Pull io_uring updates from Jens Axboe: "Mostly just come fixes and cleanups, but one feature as well. In detail: - Harden the check for handling IOPOLL based on return (Pavel) - Various minor optimizations (Pavel) - Drop remnants of SCM_RIGHTS fd passing support, now that it's no longer supported since 6.7 (me) - Fix for a case where bytes_done wasn't initialized properly on a failure condition for read/write requests (me) - Move the register related code to a separate file (me) - Add support for returning the provided ring buffer head (me) - Add support for adding a direct descriptor to the normal file table (me, Christian Brauner) - Fix for ensuring pending task_work for a ring with DEFER_TASKRUN is run even if we timeout waiting (me)" * tag 'for-6.8/io_uring-2024-01-08' of git://git.kernel.dk/linux: io_uring: ensure local task_work is run on wait timeout io_uring/kbuf: add method for returning provided buffer ring head io_uring/rw: ensure io->bytes_done is always initialized io_uring: drop any code related to SCM_RIGHTS io_uring/unix: drop usage of io_uring socket io_uring/register: move io_uring_register(2) related code to register.c io_uring/openclose: add support for IORING_OP_FIXED_FD_INSTALL io_uring/cmd: inline io_uring_cmd_get_task io_uring/cmd: inline io_uring_cmd_do_in_task_lazy io_uring: split out cmd api into a separate header io_uring: optimise ltimeout for inline execution io_uring: don't check iopoll if request completes
2024-01-10	Merge tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs	Linus Torvalds
	Pull header cleanups from Kent Overstreet: "The goal is to get sched.h down to a type only header, so the main thing happening in this patchset is splitting out various _types.h headers and dependency fixups, as well as moving some things out of sched.h to better locations. This is prep work for the memory allocation profiling patchset which adds new sched.h interdepencencies" * tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (51 commits) Kill sched.h dependency on rcupdate.h kill unnecessary thread_info.h include Kill unnecessary kernel.h include preempt.h: Kill dependency on list.h rseq: Split out rseq.h from sched.h LoongArch: signal.c: add header file to fix build error restart_block: Trim includes lockdep: move held_lock to lockdep_types.h sem: Split out sem_types.h uidgid: Split out uidgid_types.h seccomp: Split out seccomp_types.h refcount: Split out refcount_types.h uapi/linux/resource.h: fix include x86/signal: kill dependency on time.h syscall_user_dispatch.h: split out *_types.h mm_types_task.h: Trim dependencies Split out irqflags_types.h ipc: Kill bogus dependency on spinlock.h shm: Slim down dependencies workqueue: Split out workqueue_types.h ...
2024-01-09	Merge tag 'lsm-pr-20240105' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull security module updates from Paul Moore: - Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and lsm_set_self_attr(). The first syscall simply lists the LSMs enabled, while the second and third get and set the current process' LSM attributes. Yes, these syscalls may provide similar functionality to what can be found under /proc or /sys, but they were designed to support multiple, simultaneaous (stacked) LSMs from the start as opposed to the current /proc based solutions which were created at a time when only one LSM was allowed to be active at a given time. We have spent considerable time discussing ways to extend the existing /proc interfaces to support multiple, simultaneaous LSMs and even our best ideas have been far too ugly to support as a kernel API; after +20 years in the kernel, I felt the LSM layer had established itself enough to justify a handful of syscalls. Support amongst the individual LSM developers has been nearly unanimous, with a single objection coming from Tetsuo (TOMOYO) as he is worried that the LSM_ID_XXX token concept will make it more difficult for out-of-tree LSMs to survive. Several members of the LSM community have demonstrated the ability for out-of-tree LSMs to continue to exist by picking high/unused LSM_ID values as well as pointing out that many kernel APIs rely on integer identifiers, e.g. syscalls (!), but unfortunately Tetsuo's objections remain. My personal opinion is that while I have no interest in penalizing out-of-tree LSMs, I'm not going to penalize in-tree development to support out-of-tree development, and I view this as a necessary step forward to support the push for expanded LSM stacking and reduce our reliance on /proc and /sys which has occassionally been problematic for some container users. Finally, we have included the linux-api folks on (all?) recent revisions of the patchset and addressed all of their concerns. - Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit ioctls on 64-bit systems problem. This patch includes support for all of the existing LSMs which provide ioctl hooks, although it turns out only SELinux actually cares about the individual ioctls. It is worth noting that while Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this patch, they did both indicate they are okay with the changes. - Fix a potential memory leak in the CALIPSO code when IPv6 is disabled at boot. While it's good that we are fixing this, I doubt this is something users are seeing in the wild as you need to both disable IPv6 and then attempt to configure IPv6 labeled networking via NetLabel/CALIPSO; that just doesn't make much sense. Normally this would go through netdev, but Jakub asked me to take this patch and of all the trees I maintain, the LSM tree seemed like the best fit. - Update the LSM MAINTAINERS entry with additional information about our process docs, patchwork, bug reporting, etc. I also noticed that the Lockdown LSM is missing a dedicated MAINTAINERS entry so I've added that to the pull request. I've been working with one of the major Lockdown authors/contributors to see if they are willing to step up and assume a Lockdown maintainer role; hopefully that will happen soon, but in the meantime I'll continue to look after it. - Add a handful of mailmap entries for Serge Hallyn and myself. * tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits) lsm: new security_file_ioctl_compat() hook lsm: Add a __counted_by() annotation to lsm_ctx.ctx calipso: fix memory leak in netlbl_calipso_add_pass() selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test MAINTAINERS: add an entry for the lockdown LSM MAINTAINERS: update the LSM entry mailmap: add entries for Serge Hallyn's dead accounts mailmap: update/replace my old email addresses lsm: mark the lsm_id variables are marked as static lsm: convert security_setselfattr() to use memdup_user() lsm: align based on pointer length in lsm_fill_user_ctx() lsm: consolidate buffer size handling into lsm_fill_user_ctx() lsm: correct error codes in security_getselfattr() lsm: cleanup the size counters in security_getselfattr() lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation lsm: drop LSM_ID_IMA LSM: selftests for Linux Security Module syscalls SELinux: Add selfattr hooks AppArmor: Add selfattr hooks Smack: implement setselfattr and getselfattr hooks ...
2024-01-09	Merge tag 'selinux-pr-20240105' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add a new SELinux initial SID, SECINITSID_INIT, to represent userspace processes started before the SELinux policy is loaded in early boot. Prior to this patch all processes were marked as SECINITSID_KERNEL before the SELinux policy was loaded, making it difficult to distinquish early boot userspace processes from the kernel in the SELinux policy. For most users this will be a non-issue as the policy is loaded early enough during boot, but for users who load their SELinux policy relatively late, this should make it easier to construct meaningful security policies. - Cleanups to the selinuxfs code by Al, mostly on VFS related issues during a policy reload. The commit description has more detail, but the quick summary is that we are replacing a disconnected directory approach with a temporary directory that we swapover at the end of the reload. - Fix an issue where the input sanity checking on socket bind() operations was slightly different depending on the presence of SELinux. This is caused by the placement of the LSM hooks in the generic socket layer as opposed to the protocol specific bind() handler where the protocol specific sanity checks are performed. Mickaël has mentioned that he is working to fix this, but in the meantime we just ensure that we are replicating the checks properly. We need to balance the placement of the LSM hooks with the number of LSM hooks; pushing the hooks down into the protocol layers is likely not the right answer. - Update the avc_has_perm_noaudit() prototype to better match the function definition. - Migrate from using partial_name_hash() to full_name_hash() the filename transition hash table. This improves the quality of the code and has the potential for a minor performance bump. - Consolidate some open coded SELinux access vector comparisions into a single new function, avtab_node_cmp(), and use that instead. A small, but nice win for code quality and maintainability. - Updated the SELinux MAINTAINERS entry with additional information around process, bug reporting, etc. We're also updating some of our "official" roles: dropping Eric Paris and adding Ondrej as a reviewer. - Cleanup the coding style crimes in security/selinux/include. While I'm not a fan of code churn, I am pushing for more automated code checks that can be done at the developer level and one of the obvious things to check for is coding style. In an effort to start from a "good" base I'm slowly working through our source files cleaning them up with the help of clang-format and good ol' fashioned human eyeballs; this has the first batch of these changes. I've been splitting the changes up per-file to help reduce the impact if backports are required (either for LTS or distro kernels), and I expect the some of the larger files, e.g. hooks.c and ss/services.c, will likely need to be split even further. - Cleanup old, outdated comments. * tag 'selinux-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (24 commits) selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socket selinux: fix style issues in security/selinux/include/initial_sid_to_string.h selinux: fix style issues in security/selinux/include/xfrm.h selinux: fix style issues in security/selinux/include/security.h selinux: fix style issues with security/selinux/include/policycap_names.h selinux: fix style issues in security/selinux/include/policycap.h selinux: fix style issues in security/selinux/include/objsec.h selinux: fix style issues with security/selinux/include/netlabel.h selinux: fix style issues in security/selinux/include/netif.h selinux: fix style issues in security/selinux/include/ima.h selinux: fix style issues in security/selinux/include/conditional.h selinux: fix style issues in security/selinux/include/classmap.h selinux: fix style issues in security/selinux/include/avc_ss.h selinux: align avc_has_perm_noaudit() prototype with definition selinux: fix style issues in security/selinux/include/avc.h selinux: fix style issues in security/selinux/include/audit.h MAINTAINERS: drop Eric Paris from his SELinux role MAINTAINERS: add Ondrej Mosnacek as a SELinux reviewer selinux: remove the wrong comment about multithreaded process handling selinux: introduce an initial SID for early boot processes ...
2024-01-04	selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socket	Mickaël Salaün
	The IPv6 network stack first checks the sockaddr length (-EINVAL error) before checking the family (-EAFNOSUPPORT error). This was discovered thanks to commit a549d055a22e ("selftests/landlock: Add network tests"). Cc: Eric Paris <eparis@parisplace.org> Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Closes: https://lore.kernel.org/r/0584f91c-537c-4188-9e4f-04f192565667@collabora.com Fixes: 0f8db8cc73df ("selinux: add AF_UNSPEC and INADDR_ANY checks to selinux_socket_bind()") Signed-off-by: Mickaël Salaün <mic@digikod.net> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-12-24	lsm: new security_file_ioctl_compat() hook	Alfred Piccioni
	Some ioctl commands do not require ioctl permission, but are routed to other permissions such as FILE_GETATTR or FILE_SETATTR. This routing is done by comparing the ioctl cmd to a set of 64-bit flags (FS_IOC_). However, if a 32-bit process is running on a 64-bit kernel, it emits 32-bit flags (FS_IOC32_) for certain ioctl operations. These flags are being checked erroneously, which leads to these ioctl operations being routed to the ioctl permission, rather than the correct file permissions. This was also noted in a RED-PEN finding from a while back - "/* RED-PEN how should LSM module know it's handling 32bit? */". This patch introduces a new hook, security_file_ioctl_compat(), that is called from the compat ioctl syscall. All current LSMs have been changed to support this hook. Reviewing the three places where we are currently using security_file_ioctl(), it appears that only SELinux needs a dedicated compat change; TOMOYO and SMACK appear to be functional without any change. Cc: stable@vger.kernel.org Fixes: 0b24dcb7f2f7 ("Revert "selinux: simplify ioctl checking"") Signed-off-by: Alfred Piccioni <alpic@google.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> [PM: subject tweak, line length fixes, and alignment corrections] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-12-20	shm: Slim down dependencies	Kent Overstreet
	list_head is in types.h, not list.h., and the uapi header wasn't needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-15	cred: get rid of CONFIG_DEBUG_CREDENTIALS	Jens Axboe
	This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-12	io_uring: split out cmd api into a separate header	Pavel Begunkov
	linux/io_uring.h is slowly becoming a rubbish bin where we put anything exposed to other subsystems. For instance, the task exit hooks and io_uring cmd infra are completely orthogonal and don't need each other's definitions. Start cleaning it up by splitting out all command bits into a new header file. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7ec50bae6e21f371d3850796e716917fc141225a.1701391955.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-07	selinux: remove the wrong comment about multithreaded process handling	Munehisa Kamata
	Since commit d9250dea3f89 ("SELinux: add boundary support and thread context assignment"), SELinux has been supporting assigning per-thread security context under a constraint and the comment was updated accordingly. However, seems like commit d84f4f992cbd ("CRED: Inaugurate COW credentials") accidentally brought the old comment back that doesn't match what the code does. Considering the ease of understanding the code, this patch just removes the wrong comment. Fixes: d84f4f992cbd ("CRED: Inaugurate COW credentials") Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-11-21	selinux: introduce an initial SID for early boot processes	Ondrej Mosnacek
	Currently, SELinux doesn't allow distinguishing between kernel threads and userspace processes that are started before the policy is first loaded - both get the label corresponding to the kernel SID. The only way a process that persists from early boot can get a meaningful label is by doing a voluntary dyntransition or re-executing itself. Reusing the kernel label for userspace processes is problematic for several reasons: 1. The kernel is considered to be a privileged domain and generally needs to have a wide range of permissions allowed to work correctly, which prevents the policy writer from effectively hardening against early boot processes that might remain running unintentionally after the policy is loaded (they represent a potential extra attack surface that should be mitigated). 2. Despite the kernel being treated as a privileged domain, the policy writer may want to impose certain special limitations on kernel threads that may conflict with the requirements of intentional early boot processes. For example, it is a good hardening practice to limit what executables the kernel can execute as usermode helpers and to confine the resulting usermode helper processes. However, a (legitimate) process surviving from early boot may need to execute a different set of executables. 3. As currently implemented, overlayfs remembers the security context of the process that created an overlayfs mount and uses it to bound subsequent operations on files using this context. If an overlayfs mount is created before the SELinux policy is loaded, these "mounter" checks are made against the kernel context, which may clash with restrictions on the kernel domain (see 2.). To resolve this, introduce a new initial SID (reusing the slot of the former "init" initial SID) that will be assigned to any userspace process started before the policy is first loaded. This is easy to do, as we can simply label any process that goes through the bprm_creds_for_exec LSM hook with the new init-SID instead of propagating the kernel SID from the parent. To provide backwards compatibility for existing policies that are unaware of this new semantic of the "init" initial SID, introduce a new policy capability "userspace_initial_context" and set the "init" SID to the same context as the "kernel" SID unless this capability is set by the policy. Another small backwards compatibility measure is needed in security_sid_to_context_core() for before the initial SELinux policy load - see the code comment for explanation. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> [PM: edited comments based on feedback/discussion] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-11-12	lsm: mark the lsm_id variables are marked as static	Paul Moore
	As the kernel test robot helpfully reminded us, all of the lsm_id instances defined inside the various LSMs should be marked as static. The one exception is Landlock which uses its lsm_id variable across multiple source files with an extern declaration in a header file. Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-11-12	lsm: consolidate buffer size handling into lsm_fill_user_ctx()	Paul Moore
	While we have a lsm_fill_user_ctx() helper function designed to make life easier for LSMs which return lsm_ctx structs to userspace, we didn't include all of the buffer length safety checks and buffer padding adjustments in the helper. This led to code duplication across the different LSMs and the possibility for mistakes across the different LSM subsystems. In order to reduce code duplication and decrease the chances of silly mistakes, we're consolidating all of this code into the lsm_fill_user_ctx() helper. The buffer padding is also modified from a fixed 8-byte alignment to an alignment that matches the word length of the machine (BITS_PER_LONG / 8). Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-11-12	SELinux: Add selfattr hooks	Casey Schaufler
	Add hooks for setselfattr and getselfattr. These hooks are not very different from their setprocattr and getprocattr equivalents, and much of the code is shared. Cc: selinux@vger.kernel.org Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-11-12	LSM: Identify modules by more than name	Casey Schaufler
	Create a struct lsm_id to contain identifying information about Linux Security Modules (LSMs). At inception this contains the name of the module and an identifier associated with the security module. Change the security_add_hooks() interface to use this structure. Change the individual modules to maintain their own struct lsm_id and pass it to security_add_hooks(). The values are for LSM identifiers are defined in a new UAPI header file linux/lsm.h. Each existing LSM has been updated to include it's LSMID in the lsm_id. The LSM ID values are sequential, with the oldest module LSM_ID_CAPABILITY being the lowest value and the existing modules numbered in the order they were included in the main line kernel. This is an arbitrary convention for assigning the values, but none better presents itself. The value 0 is defined as being invalid. The values 1-99 are reserved for any special case uses which may arise in the future. This may include attributes of the LSM infrastructure itself, possibly related to namespacing or network attribute management. A special range is identified for such attributes to help reduce confusion for developers unfamiliar with LSMs. LSM attribute values are defined for the attributes presented by modules that are available today. As with the LSM IDs, The value 0 is defined as being invalid. The values 1-99 are reserved for any special case uses which may arise in the future. Cc: linux-security-module <linux-security-module@vger.kernel.org> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Mickael Salaun <mic@digikod.net> Reviewed-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Kees Cook <keescook@chromium.org> Nacked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> [PM: forward ported beyond v6.6 due merge window changes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-10-30	Merge tag 'lsm-pr-20231030' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM updates from Paul Moore: - Add new credential functions, get_cred_many() and put_cred_many() to save some atomic_t operations for a few operations. While not strictly LSM related, this patchset had been rotting on the mailing lists for some time and since the LSMs do care a lot about credentials I thought it reasonable to give this patch a home. - Five patches to constify different LSM hook parameters. - Fix a spelling mistake. * tag 'lsm-pr-20231030' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: fix a spelling mistake cred: add get_cred_many and put_cred_many lsm: constify 'sb' parameter in security_sb_kern_mount() lsm: constify 'bprm' parameter in security_bprm_committed_creds() lsm: constify 'bprm' parameter in security_bprm_committing_creds() lsm: constify 'file' parameter in security_bprm_creds_from_file() lsm: constify 'sb' parameter in security_quotactl()
2023-09-14	lsm: constify 'sb' parameter in security_sb_kern_mount()	Khadija Kamran
	The "sb_kern_mount" hook has implementation registered in SELinux. Looking at the function implementation we observe that the "sb" parameter is not changing. Mark the "sb" parameter of LSM hook security_sb_kern_mount() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: minor merge fuzzing due to other constification patches] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-14	lsm: constify 'bprm' parameter in security_bprm_committed_creds()	Khadija Kamran
	Three LSMs register the implementations for the 'bprm_committed_creds()' hook: AppArmor, SELinux and tomoyo. Looking at the function implementations we may observe that the 'bprm' parameter is not changing. Mark the 'bprm' parameter of LSM hook security_bprm_committed_creds() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: minor merge fuzzing due to other constification patches] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-13	lsm: constify 'bprm' parameter in security_bprm_committing_creds()	Khadija Kamran
	The 'bprm_committing_creds' hook has implementations registered in SELinux and Apparmor. Looking at the function implementations we observe that the 'bprm' parameter is not changing. Mark the 'bprm' parameter of LSM hook security_bprm_committing_creds() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-13	lsm: constify 'sb' parameter in security_quotactl()	Khadija Kamran
	SELinux registers the implementation for the "quotactl" hook. Looking at the function implementation we observe that the parameter "sb" is not changing. Mark the "sb" parameter of LSM hook security_quotactl() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-09-12	selinux: fix handling of empty opts in selinux_fs_context_submount()	Ondrej Mosnacek
	selinux_set_mnt_opts() relies on the fact that the mount options pointer is always NULL when all options are unset (specifically in its !selinux_initialized() branch. However, the new selinux_fs_context_submount() hook breaks this rule by allocating a new structure even if no options are set. That causes any submount created before a SELinux policy is loaded to be rejected in selinux_set_mnt_opts(). Fix this by making selinux_fs_context_submount() leave fc->security set to NULL when there are no options to be copied from the reference superblock. Cc: <stable@vger.kernel.org> Reported-by: Adam Williamson <awilliam@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2236345 Fixes: d80a8f1b58c2 ("vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-30	Merge tag 'lsm-pr-20230829' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM updates from Paul Moore: - Add proper multi-LSM support for xattrs in the security_inode_init_security() hook Historically the LSM layer has only allowed a single LSM to add an xattr to an inode, with IMA/EVM measuring that and adding its own as well. As we work towards promoting IMA/EVM to a "proper LSM" instead of the special case that it is now, we need to better support the case of multiple LSMs each adding xattrs to an inode and after several attempts we now appear to have something that is working well. It is worth noting that in the process of making this change we uncovered a problem with Smack's SMACK64TRANSMUTE xattr which is also fixed in this pull request. - Additional LSM hook constification Two patches to constify parameters to security_capget() and security_binder_transfer_file(). While I generally don't make a special note of who submitted these patches, these were the work of an Outreachy intern, Khadija Kamran, and that makes me happy; hopefully it does the same for all of you reading this. - LSM hook comment header fixes One patch to add a missing hook comment header, one to fix a minor typo. - Remove an old, unused credential function declaration It wasn't clear to me who should pick this up, but it was trivial, obviously correct, and arguably the LSM layer has a vested interest in credentials so I merged it. Sadly I'm now noticing that despite my subject line cleanup I didn't cleanup the "unsued" misspelling, sigh * tag 'lsm-pr-20230829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: constify the 'file' parameter in security_binder_transfer_file() lsm: constify the 'target' parameter in security_capget() lsm: add comment block for security_sk_classify_flow LSM hook security: Fix ret values doc for security_inode_init_security() cred: remove unsued extern declaration change_create_files_as() evm: Support multiple LSMs providing an xattr evm: Align evm_inode_init_security() definition with LSM infrastructure smack: Set the SMACK64TRANSMUTE xattr in smack_inode_init_security() security: Allow all LSMs to provide xattrs for inode_init_security hook lsm: fix typo in security_file_lock() comment header
2023-08-30	Merge tag 'selinux-pr-20230829' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Thirty three SELinux patches, which is a pretty big number for us, but there isn't really anything scary in here; in fact we actually manage to remove 10 lines of code with this :) - Promote the SELinux DEBUG_HASHES macro to CONFIG_SECURITY_SELINUX_DEBUG The DEBUG_HASHES macro was a buried SELinux specific preprocessor debug macro that was a problem waiting to happen. Promoting the debug macro to a proper Kconfig setting should help both improve the visibility of the feature as well enable improved test coverage. We've moved some additional debug functions under the CONFIG_SECURITY_SELINUX_DEBUG flag and we may see more work in the future. - Emit a pr_notice() message if virtual memory is executable by default As this impacts the SELinux access control policy enforcement, if the system's configuration is such that virtual memory is executable by default we print a single line notice to the console. - Drop avtab_search() in favor of avtab_search_node() Both functions are nearly identical so we removed avtab_search() and converted the callers to avtab_search_node(). - Add some SELinux network auditing helpers The helpers not only reduce a small amount of code duplication, but they provide an opportunity to improve UDP flood performance slightly by delaying initialization of the audit data in some cases. - Convert GFP_ATOMIC allocators to GFP_KERNEL when reading SELinux policy There were two SELinux policy load helper functions that were allocating memory using GFP_ATOMIC, they have been converted to GFP_KERNEL. - Quiet a KMSAN warning in selinux_inet_conn_request() A one-line error path (re)set patch that resolves a KMSAN warning. It is important to note that this doesn't represent a real bug in the current code, but it quiets KMSAN and arguably hardens the code against future changes. - Cleanup the policy capability accessor functions This is a follow-up to the patch which reverted SELinux to using a global selinux_state pointer. This patch cleans up some artifacts of that change and turns each accessor into a one-line READ_ONCE() call into the policy capabilities array. - A number of patches from Christian Göttsche Christian submitted almost two-thirds of the patches in this pull request as he worked to harden the SELinux code against type differences, variable overflows, etc. - Support for separating early userspace from the kernel in policy, with a later revert We did have a patch that added a new userspace initial SID which would allow SELinux to distinguish between early user processes created before the initial policy load and the kernel itself. Unfortunately additional post-merge testing revealed a problematic interaction with an old SELinux userspace on an old version of Ubuntu so we've reverted the patch until we can resolve the compatibility issue. - Remove some outdated comments dealing with LSM hook registration When we removed the runtime disable functionality we forgot to remove some old comments discussing the importance of LSM hook registration ordering. - Minor administrative changes Stephen Smalley updated his email address and "debranded" SELinux from "NSA SELinux" to simply "SELinux". We've come a long way from the original NSA submission and I would consider SELinux a true community project at this point so removing the NSA branding just makes sense" * tag 'selinux-pr-20230829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (33 commits) selinux: prevent KMSAN warning in selinux_inet_conn_request() selinux: use unsigned iterator in nlmsgtab code selinux: avoid implicit conversions in policydb code selinux: avoid implicit conversions in selinuxfs code selinux: make left shifts well defined selinux: update type for number of class permissions in services code selinux: avoid implicit conversions in avtab code selinux: revert SECINITSID_INIT support selinux: use GFP_KERNEL while reading binary policy selinux: update comment on selinux_hooks[] selinux: avoid implicit conversions in services code selinux: avoid implicit conversions in mls code selinux: use identical iterator type in hashtab_duplicate() selinux: move debug functions into debug configuration selinux: log about VM being executable by default selinux: fix a 0/NULL mistmatch in ad_net_init_from_iif() selinux: introduce SECURITY_SELINUX_DEBUG configuration selinux: introduce and use lsm_ad_net_init*() helpers selinux: update my email address selinux: add missing newlines in pr_err() statements ...
2023-08-29	Merge tag 'mm-stable-2023-08-28-18-26' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which reduces the special-case code for handling hugetlb pages in GUP. It also speeds up GUP handling of transparent hugepages. - Peng Zhang provides some maple tree speedups ("Optimize the fast path of mas_store()"). - Sergey Senozhatsky has improved te performance of zsmalloc during compaction (zsmalloc: small compaction improvements"). - Domenico Cerasuolo has developed additional selftest code for zswap ("selftests: cgroup: add zswap test program"). - xu xin has doe some work on KSM's handling of zero pages. These changes are mainly to enable the user to better understand the effectiveness of KSM's treatment of zero pages ("ksm: support tracking KSM-placed zero-pages"). - Jeff Xu has fixes the behaviour of memfd's MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED"). - David Howells has fixed an fscache optimization ("mm, netfs, fscache: Stop read optimisation when folio removed from pagecache"). - Axel Rasmussen has given userfaultfd the ability to simulate memory poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD"). - Miaohe Lin has contributed some routine maintenance work on the memory-failure code ("mm: memory-failure: remove unneeded PageHuge() check"). - Peng Zhang has contributed some maintenance work on the maple tree code ("Improve the validation for maple tree and some cleanup"). - Hugh Dickins has optimized the collapsing of shmem or file pages into THPs ("mm: free retracted page table by RCU"). - Jiaqi Yan has a patch series which permits us to use the healthy subpages within a hardware poisoned huge page for general purposes ("Improve hugetlbfs read on HWPOISON hugepages"). - Kemeng Shi has done some maintenance work on the pagetable-check code ("Remove unused parameters in page_table_check"). - More folioification work from Matthew Wilcox ("More filesystem folio conversions for 6.6"), ("Followup folio conversions for zswap"). And from ZhangPeng ("Convert several functions in page_io.c to use a folio"). - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext"). - Baoquan He has converted some architectures to use the GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take GENERIC_IOREMAP way"). - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support batched/deferred tlb shootdown during page reclamation/migration"). - Better maple tree lockdep checking from Liam Howlett ("More strict maple tree lockdep"). Liam also developed some efficiency improvements ("Reduce preallocations for maple tree"). - Cleanup and optimization to the secondary IOMMU TLB invalidation, from Alistair Popple ("Invalidate secondary IOMMU TLB on permission upgrade"). - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes for arm64"). - Kemeng Shi provides some maintenance work on the compaction code ("Two minor cleanups for compaction"). - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most file-backed faults under the VMA lock"). - Aneesh Kumar contributes code to use the vmemmap optimization for DAX on ppc64, under some circumstances ("Add support for DAX vmemmap optimization for ppc64"). - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client data in page_ext"), ("minor cleanups to page_ext header"). - Some zswap cleanups from Johannes Weiner ("mm: zswap: three cleanups"). - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan"). - VMA handling cleanups from Kefeng Wang ("mm: convert to vma_is_initial_heap/stack()"). - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes: implement DAMOS tried total bytes file"), ("Extend DAMOS filters for address ranges and DAMON monitoring targets"). - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction"). - Liam Howlett has improved the maple tree node replacement code ("maple_tree: Change replacement strategy"). - ZhangPeng has a general code cleanup - use the K() macro more widely ("cleanup with helper macro K()"). - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap on memory feature on ppc64"). - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list in page_alloc"), ("Two minor cleanups for get pageblock migratetype"). - Vishal Moola introduces a memory descriptor for page table tracking, "struct ptdesc" ("Split ptdesc from struct page"). - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups for vm.memfd_noexec"). - MM include file rationalization from Hugh Dickins ("arch: include asm/cacheflush.h in asm/hugetlb.h"). - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text output"). - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use object_cache instead of kmemleak_initialized"). - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor and _folio_order"). - A VMA locking scalability improvement from Suren Baghdasaryan ("Per-VMA lock support for swap and userfaults"). - pagetable handling cleanups from Matthew Wilcox ("New page table range API"). - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop using page->private on tail pages for THP_SWAP + cleanups"). - Cleanups and speedups to the hugetlb fault handling from Matthew Wilcox ("Change calling convention for ->huge_fault"). - Matthew Wilcox has also done some maintenance work on the MM subsystem documentation ("Improve mm documentation"). * tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits) maple_tree: shrink struct maple_tree maple_tree: clean up mas_wr_append() secretmem: convert page_is_secretmem() to folio_is_secretmem() nios2: fix flush_dcache_page() for usage from irq context hugetlb: add documentation for vma_kernel_pagesize() mm: add orphaned kernel-doc to the rst files. mm: fix clean_record_shared_mapping_range kernel-doc mm: fix get_mctgt_type() kernel-doc mm: fix kernel-doc warning from tlb_flush_rmaps() mm: remove enum page_entry_size mm: allow ->huge_fault() to be called without the mmap_lock held mm: move PMD_ORDER to pgtable.h mm: remove checks for pte_index memcg: remove duplication detection for mem_cgroup_uncharge_swap mm/huge_memory: work on folio->swap instead of page->private when splitting folio mm/swap: inline folio_set_swap_entry() and folio_swap_entry() mm/swap: use dedicated entry for swap in folio mm/swap: stop using page->private on tail pages for THP_SWAP selftests/mm: fix WARNING comparing pointer to 0 selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check ...
2023-08-29	Merge tag 'net-next-6.6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Increase size limits for to-be-sent skb frag allocations. This allows tun, tap devices and packet sockets to better cope with large writes operations - Store netdevs in an xarray, to simplify iterating over netdevs - Refactor nexthop selection for multipath routes - Improve sched class lifetime handling - Add backup nexthop ID support for bridge - Implement drop reasons support in openvswitch - Several data races annotations and fixes - Constify the sk parameter of routing functions - Prepend kernel version to netconsole message Protocols: - Implement support for TCP probing the peer being under memory pressure - Remove hard coded limitation on IPv6 specific info placement inside the socket struct - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per socket scaling factor - Scaling-up the IPv6 expired route GC via a separated list of expiring routes - In-kernel support for the TLS alert protocol - Better support for UDP reuseport with connected sockets - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR header size - Get rid of additional ancillary per MPTCP connection struct socket - Implement support for BPF-based MPTCP packet schedulers - Format MPTCP subtests selftests results in TAP - Several new SMC 2.1 features including unique experimental options, max connections per lgr negotiation, max links per lgr negotiation BPF: - Multi-buffer support in AF_XDP - Add multi uprobe BPF links for attaching multiple uprobes and usdt probes, which is significantly faster and saves extra fds - Implement an fd-based tc BPF attach API (TCX) and BPF link support on top of it - Add SO_REUSEPORT support for TC bpf_sk_assign - Support new instructions from cpu v4 to simplify the generated code and feature completeness, for x86, arm64, riscv64 - Support defragmenting IPv(4\|6) packets in BPF - Teach verifier actual bounds of bpf_get_smp_processor_id() and fix perf+libbpf issue related to custom section handling - Introduce bpf map element count and enable it for all program types - Add a BPF hook in sys_socket() to change the protocol ID from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy - Introduce bpf_me_mcache_free_rcu() and fix OOM under stress - Add uprobe support for the bpf_get_func_ip helper - Check skb ownership against full socket - Support for up to 12 arguments in BPF trampoline - Extend link_info for kprobe_multi and perf_event links Netfilter: - Speed-up process exit by aborting ruleset validation if a fatal signal is pending - Allow NLA_POLICY_MASK to be used with BE16/BE32 types Driver API: - Page pool optimizations, to improve data locality and cache usage - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the need for raw ioctl() handling in drivers - Simplify genetlink dump operations (doit/dumpit) providing them the common information already populated in struct genl_info - Extend and use the yaml devlink specs to [re]generate the split ops - Introduce devlink selective dumps, to allow SF filtering SF based on handle and other attributes - Add yaml netlink spec for netlink-raw families, allow route, link and address related queries via the ynl tool - Remove phylink legacy mode support - Support offload LED blinking to phy - Add devlink port function attributes for IPsec New hardware / drivers: - Ethernet: - Broadcom ASP 2.0 (72165) ethernet controller - MediaTek MT7988 SoC - Texas Instruments AM654 SoC - Texas Instruments IEP driver - Atheros qca8081 phy - Marvell 88Q2110 phy - NXP TJA1120 phy - WiFi: - MediaTek mt7981 support - Can: - Kvaser SmartFusion2 PCI Express devices - Allwinner T113 controllers - Texas Instruments tcan4552/4553 chips - Bluetooth: - Intel Gale Peak - Qualcomm WCN3988 and WCN7850 - NXP AW693 and IW624 - Mediatek MT2925 Drivers: - Ethernet NICs: - nVidia/Mellanox: - mlx5: - support UDP encapsulation in packet offload mode - IPsec packet offload support in eswitch mode - improve aRFS observability by adding new set of counters - extends MACsec offload support to cover RoCE traffic - dynamic completion EQs - mlx4: - convert to use auxiliary bus instead of custom interface logic - Intel - ice: - implement switchdev bridge offload, even for LAG interfaces - implement SRIOV support for LAG interfaces - igc: - add support for multiple in-flight TX timestamps - Broadcom: - bnxt: - use the unified RX page pool buffers for XDP and non-XDP - use the NAPI skb allocation cache - OcteonTX2: - support Round Robin scheduling HTB offload - TC flower offload support for SPI field - Freescale: - add XDP_TX feature support - AMD: - ionic: add support for PCI FLR event - sfc: - basic conntrack offload - introduce eth, ipv4 and ipv6 pedit offloads - ST Microelectronics: - stmmac: maximze PTP timestamping resolution - Virtual NICs: - Microsoft vNIC: - batch ringing RX queue doorbell on receiving packets - add page pool for RX buffers - Virtio vNIC: - add per queue interrupt coalescing support - Google vNIC: - add queue-page-list mode support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add port range matching tc-flower offload - permit enslavement to netdevices with uppers - Ethernet embedded switches: - Marvell (mv88e6xxx): - convert to phylink_pcs - Renesas: - r8A779fx: add speed change support - rzn1: enables vlan support - Ethernet PHYs: - convert mv88e6xxx to phylink_pcs - WiFi: - Qualcomm Wi-Fi 7 (ath12k): - extremely High Throughput (EHT) PHY support - RealTek (rtl8xxxu): - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU - RealTek (rtw89): - Introduce Time Averaged SAR (TAS) support - Connector: - support for event filtering" * tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits) net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler net: stmmac: clarify difference between "interface" and "phy_interface" r8152: add vendor/device ID pair for D-Link DUB-E250 devlink: move devlink_notify_register/unregister() to dev.c devlink: move small_ops definition into netlink.c devlink: move tracepoint definitions into core.c devlink: push linecard related code into separate file devlink: push rate related code into separate file devlink: push trap related code into separate file devlink: use tracepoint_enabled() helper devlink: push region related code into separate file devlink: push param related code into separate file devlink: push resource related code into separate file devlink: push dpipe related code into separate file devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper devlink: push shared buffer related code into separate file devlink: push port related code into separate file devlink: push object register/unregister notifications into separate helpers inet: fix IP_TRANSPARENT error handling ...
2023-08-21	selinux: use vma_is_initial_stack() and vma_is_initial_heap()	Kefeng Wang
	Use the helpers to simplify code. Link: https://lkml.kernel.org/r/20230728050043.59880-4-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Christian Göttsche <cgzones@googlemail.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-15	lsm: constify the 'file' parameter in security_binder_transfer_file()	Khadija Kamran
	SELinux registers the implementation for the "binder_transfer_file" hook. Looking at the function implementation we observe that the parameter "file" is not changing. Mark the "file" parameter of LSM hook security_binder_transfer_file() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: subject line whitespace fix] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-15	vfs, security: Fix automount superblock LSM init problem, preventing NFS sb ↵	David Howells
	sharing When NFS superblocks are created by automounting, their LSM parameters aren't set in the fs_context struct prior to sget_fc() being called, leading to failure to match existing superblocks. This bug leads to messages like the following appearing in dmesg when fscache is enabled: NFS: Cache volume key already in use (nfs,4.2,2,108,106a8c0,1,,,,100000,100000,2ee,3a98,1d4c,3a98,1) Fix this by adding a new LSM hook to load fc->security for submount creation. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/165962680944.3334508.6610023900349142034.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165962729225.3357250.14350728846471527137.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/165970659095.2812394.6868894171102318796.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/166133579016.3678898.6283195019480567275.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/217595.1662033775@warthog.procyon.org.uk/ # v5 Fixes: 9bc61ab18b1d ("vfs: Introduce fs_context, switch vfs_kern_mount() to it.") Fixes: 779df6a5480f ("NFS: Ensure security label is set for root inode") Tested-by: Jeff Layton <jlayton@kernel.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: "Christian Brauner (Microsoft)" <brauner@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230808-master-v9-1-e0ecde888221@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-09	selinux: revert SECINITSID_INIT support	Paul Moore
	This commit reverts 5b0eea835d4e ("selinux: introduce an initial SID for early boot processes") as it was found to cause problems on distros with old SELinux userspace tools/libraries, specifically Ubuntu 16.04. Hopefully we will be able to re-add this functionality at a later date, but let's revert this for now to help ensure a stable and backwards compatible SELinux tree. Link: https://lore.kernel.org/selinux/87edkseqf8.fsf@mail.lhotse Acked-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-08	lsm: constify the 'target' parameter in security_capget()	Khadija Kamran
	Three LSMs register the implementations for the "capget" hook: AppArmor, SELinux, and the normal capability code. Looking at the function implementations we may observe that the first parameter "target" is not changing. Mark the first argument "target" of LSM hook security_capget() as "const" since it will not be changing in the LSM hook. cap_capget() LSM hook declaration exceeds the 80 characters per line limit. Split the function declaration to multiple lines to decrease the line length. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Acked-by: John Johansen <john.johansen@canonical.com> [PM: align the cap_capget() declaration, spelling fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-08-08	selinux: update comment on selinux_hooks[]	Xiu Jianfeng
	After commit f22f9aaf6c3d ("selinux: remove the runtime disable functionality"), the comment on selinux_hooks[] is out-of-date, remove the last paragraph about runtime disable functionality. Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-28	selinux: log about VM being executable by default	Christian Göttsche
	In case virtual memory is being marked as executable by default, SELinux checks regarding explicit potential dangerous use are disabled. Inform the user about it. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-20	selinux: fix a 0/NULL mistmatch in ad_net_init_from_iif()	Paul Moore
	Use a NULL instead of a zero to resolve a int/pointer mismatch. Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307210332.4AqFZfzI-lkp@intel.com/ Fixes: dd51fcd42fd6 ("selinux: introduce and use lsm_ad_net_init*() helpers") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19	selinux: introduce and use lsm_ad_net_init*() helpers	Paolo Abeni
	Perf traces of network-related workload shows a measurable overhead inside the network-related selinux hooks while zeroing the lsm_network_audit struct. In most cases we can delay the initialization of such structure to the usage point, avoiding such overhead in a few cases. Additionally, the audit code accesses the IP address information only for AF_INET* families, and selinux_parse_skb() will fill-out the relevant fields in such cases. When the family field is zeroed or the initialization is followed by the mentioned parsing, the zeroing can be limited to the sk, family and netif fields. By factoring out the audit-data initialization to new helpers, this patch removes some duplicate code and gives small but measurable performance gain under UDP flood. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19	selinux: update my email address	Stephen Smalley
	Update my email address; MAINTAINERS was updated some time ago. Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-19	selinux: add missing newlines in pr_err() statements	Christian Göttsche
	The kernel print statements do not append an implicit newline to format strings. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18	selinux: de-brand SELinux	Stephen Smalley
	Change "NSA SELinux" to just "SELinux" in Kconfig help text and comments. While NSA was the original primary developer and continues to help maintain SELinux, SELinux has long since transitioned to a wide community of developers and maintainers. SELinux has been part of the mainline Linux kernel for nearly 20 years now [1] and has received contributions from many individuals and organizations. [1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0308082228470.1852-100000@home.osdl.org/ Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-18	selinux: avoid implicit conversions in the LSM hooks	Christian Göttsche
	Use the identical types in assignments of local variables for the destination. Merge tail calls into return statements. Avoid using leading underscores for function local variable. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-14	security: Constify sk in the sk_getsecid hook.	Guillaume Nault
	The sk_getsecid hook shouldn't need to modify its socket argument. Make it const so that callers of security_sk_classify_flow() can use a const struct sock *. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-10	selinux: introduce an initial SID for early boot processes	Ondrej Mosnacek
	Currently, SELinux doesn't allow distinguishing between kernel threads and userspace processes that are started before the policy is first loaded - both get the label corresponding to the kernel SID. The only way a process that persists from early boot can get a meaningful label is by doing a voluntary dyntransition or re-executing itself. Reusing the kernel label for userspace processes is problematic for several reasons: 1. The kernel is considered to be a privileged domain and generally needs to have a wide range of permissions allowed to work correctly, which prevents the policy writer from effectively hardening against early boot processes that might remain running unintentionally after the policy is loaded (they represent a potential extra attack surface that should be mitigated). 2. Despite the kernel being treated as a privileged domain, the policy writer may want to impose certain special limitations on kernel threads that may conflict with the requirements of intentional early boot processes. For example, it is a good hardening practice to limit what executables the kernel can execute as usermode helpers and to confine the resulting usermode helper processes. However, a (legitimate) process surviving from early boot may need to execute a different set of executables. 3. As currently implemented, overlayfs remembers the security context of the process that created an overlayfs mount and uses it to bound subsequent operations on files using this context. If an overlayfs mount is created before the SELinux policy is loaded, these "mounter" checks are made against the kernel context, which may clash with restrictions on the kernel domain (see 2.). To resolve this, introduce a new initial SID (reusing the slot of the former "init" initial SID) that will be assigned to any userspace process started before the policy is first loaded. This is easy to do, as we can simply label any process that goes through the bprm_creds_for_exec LSM hook with the new init-SID instead of propagating the kernel SID from the parent. To provide backwards compatibility for existing policies that are unaware of this new semantic of the "init" initial SID, introduce a new policy capability "userspace_initial_context" and set the "init" SID to the same context as the "kernel" SID unless this capability is set by the policy. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10	security: Allow all LSMs to provide xattrs for inode_init_security hook	Roberto Sassu
	Currently, the LSM infrastructure supports only one LSM providing an xattr and EVM calculating the HMAC on that xattr, plus other inode metadata. Allow all LSMs to provide one or multiple xattrs, by extending the security blob reservation mechanism. Introduce the new lbs_xattr_count field of the lsm_blob_sizes structure, so that each LSM can specify how many xattrs it needs, and the LSM infrastructure knows how many xattr slots it should allocate. Modify the inode_init_security hook definition, by passing the full xattr array allocated in security_inode_init_security(), and the current number of xattr slots in that array filled by LSMs. The first parameter would allow EVM to access and calculate the HMAC on xattrs supplied by other LSMs, the second to not leave gaps in the xattr array, when an LSM requested but did not provide xattrs (e.g. if it is not initialized). Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the number specified in the lbs_xattr_count field of the lsm_blob_sizes structure. During each call, lsm_get_xattr_slot() increments the number of filled xattrs, so that at the next invocation it returns the next xattr slot to fill. Cleanup security_inode_init_security(). Unify the !initxattrs and initxattrs case by simply not allocating the new_xattrs array in the former. Update the documentation to reflect the changes, and fix the description of the xattr name, as it is not allocated anymore. Adapt both SELinux and Smack to use the new definition of the inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and fill the reserved slots in the xattr array. Move the xattr->name assignment after the xattr->value one, so that it is done only in case of successful memory allocation. Finally, change the default return value of the inode_init_security hook from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook conventions. Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org> Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/ Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> [PM: minor comment and variable tweaks, approved by RS] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-30	selinux: make labeled NFS work when mounted before policy load	Ondrej Mosnacek
	Currently, when an NFS filesystem that supports passing LSM/SELinux labels is mounted during early boot (before the SELinux policy is loaded), it ends up mounted without the labeling support (i.e. with Fedora policy all files get the generic NFS label system_u:object_r:nfs_t:s0). This is because the information that the NFS mount supports passing labels (communicated to the LSM layer via the kern_flags argument of security_set_mnt_opts()) gets lost and when the policy is loaded the mount is initialized as if the passing is not supported. Fix this by noting the "native labeling" in newsbsec->flags (using a new SE_SBNATIVE flag) on the pre-policy-loaded call of selinux_set_mnt_opts() and then making sure it is respected on the second call from delayed_superblock_init(). Additionally, make inode_doinit_with_dentry() initialize the inode's label from its extended attributes whenever it doesn't find it already intitialized by the filesystem. This is needed to properly initialize pre-existing inodes when delayed_superblock_init() is called. It should not trigger in any other cases (and if it does, it's still better to initialize the correct label instead of leaving the inode unlabeled). Fixes: eb9ae686507b ("SELinux: Add new labeling type native labels") Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> [PM: fixed 'Fixes' tag format] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-18	selinux: Implement mptcp_add_subflow hook	Paolo Abeni
	Newly added subflows should inherit the LSM label from the associated MPTCP socket regardless of the current context. This patch implements the above copying sid and class from the MPTCP socket context, deleting the existing subflow label, if any, and then re-creating the correct one. The new helper reuses the selinux_netlbl_sk_security_free() function, and the latter can end-up being called multiple times with the same argument; we additionally need to make it idempotent. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08	selinux: declare read-only data arrays const	Christian Göttsche
	The array of mount tokens in only used in match_opt_prefix() and never modified. The array of symtab names is never modified and only used in the DEBUG_HASHES configuration as output. The array of files for the SElinux filesystem sub-directory `ss` is similar to the other `struct tree_descr` usages only read from to construct the containing entries. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-05-08	selinux: adjust typos in comments	Christian Göttsche
	Found by codespell(1) Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-20	selinux: remove the runtime disable functionality	Paul Moore
	After working with the larger SELinux-based distros for several years, we're finally at a place where we can disable the SELinux runtime disable functionality. The existing kernel deprecation notice explains the functionality and why we want to remove it: The selinuxfs "disable" node allows SELinux to be disabled at runtime prior to a policy being loaded into the kernel. If disabled via this mechanism, SELinux will remain disabled until the system is rebooted. The preferred method of disabling SELinux is via the "selinux=0" boot parameter, but the selinuxfs "disable" node was created to make it easier for systems with primitive bootloaders that did not allow for easy modification of the kernel command line. Unfortunately, allowing for SELinux to be disabled at runtime makes it difficult to secure the kernel's LSM hooks using the "__ro_after_init" feature. It is that last sentence, mentioning the '__ro_after_init' hardening, which is the real motivation for this change, and if you look at the diffstat you'll see that the impact of this patch reaches across all the different LSMs, helping prevent tampering at the LSM hook level. From a SELinux perspective, it is important to note that if you continue to disable SELinux via "/etc/selinux/config" it may appear that SELinux is disabled, but it is simply in an uninitialized state. If you load a policy with `load_policy -i`, you will see SELinux come alive just as if you had loaded the policy during early-boot. It is also worth noting that the "/sys/fs/selinux/disable" file is always writable now, regardless of the Kconfig settings, but writing to the file has no effect on the system, other than to display an error on the console if a non-zero/true value is written. Finally, in the several years where we have been working on deprecating this functionality, there has only been one instance of someone mentioning any user visible breakage. In this particular case it was an individual's kernel test system, and the workaround documented in the deprecation notice ("selinux=0" on the kernel command line) resolved the issue without problem. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-20	selinux: remove the 'checkreqprot' functionality	Paul Moore
	We originally promised that the SELinux 'checkreqprot' functionality would be removed no sooner than June 2021, and now that it is March 2023 it seems like it is a good time to do the final removal. The deprecation notice in the kernel provides plenty of detail on why 'checkreqprot' is not desirable, with the key point repeated below: This was a compatibility mechanism for legacy userspace and for the READ_IMPLIES_EXEC personality flag. However, if set to 1, it weakens security by allowing mappings to be made executable without authorization by policy. The default value of checkreqprot at boot was changed starting in Linux v4.4 to 0 (i.e. check the actual protection), and Android and Linux distributions have been explicitly writing a "0" to /sys/fs/selinux/checkreqprot during initialization for some time. Along with the official deprecation notice, we have been discussing this on-list and directly with several of the larger SELinux-based distros and everyone is happy to see this feature finally removed. In an attempt to catch all of the smaller, and DIY, Linux systems we have been writing a deprecation notice URL into the kernel log, along with a growing ssleep() penalty, when admins enabled checkreqprot at runtime or via the kernel command line. We have yet to have anyone come to us and raise an objection to the deprecation or planned removal. It is worth noting that while this patch removes the checkreqprot functionality, it leaves the user visible interfaces (kernel command line and selinuxfs file) intact, just inert. This should help prevent breakages with existing userspace tools that correctly, but unnecessarily, disable checkreqprot at boot or runtime. Admins that attempt to enable checkreqprot will be met with a removal message in the kernel log. Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-14	selinux: stop passing selinux_state pointers and their offspring	Stephen Smalley
	Linus observed that the pervasive passing of selinux_state pointers introduced by me in commit aa8e712cee93 ("selinux: wrap global selinux state") adds overhead and complexity without providing any benefit. The original idea was to pave the way for SELinux namespaces but those have not yet been implemented and there isn't currently a concrete plan to do so. Remove the passing of the selinux_state pointers, reverting to direct use of the single global selinux_state, and likewise remove passing of child pointers like the selinux_avc. The selinux_policy pointer remains as it is needed for atomic switching of policies. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202303101057.mZ3Gv5fK-lkp@intel.com/ Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-01-19	fs: port inode_owner_or_capable() to mnt_idmap	Christian Brauner
	Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-01-19	fs: port acl to mnt_idmap	Christian Brauner
	Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>