diff options
Diffstat (limited to 'Documentation/admin-guide/LSM')
| -rw-r--r-- | Documentation/admin-guide/LSM/SELinux.rst | 11 | ||||
| -rw-r--r-- | Documentation/admin-guide/LSM/SafeSetID.rst | 2 | ||||
| -rw-r--r-- | Documentation/admin-guide/LSM/Smack.rst | 16 | ||||
| -rw-r--r-- | Documentation/admin-guide/LSM/index.rst | 1 | ||||
| -rw-r--r-- | Documentation/admin-guide/LSM/ipe.rst | 86 | ||||
| -rw-r--r-- | Documentation/admin-guide/LSM/landlock.rst | 189 |
6 files changed, 279 insertions, 26 deletions
diff --git a/Documentation/admin-guide/LSM/SELinux.rst b/Documentation/admin-guide/LSM/SELinux.rst index 520a1c2c6fd2..cdd65164ca96 100644 --- a/Documentation/admin-guide/LSM/SELinux.rst +++ b/Documentation/admin-guide/LSM/SELinux.rst @@ -2,6 +2,17 @@ SELinux ======= +Information about the SELinux kernel subsystem can be found at the +following links: + + https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git/tree/README.md + + https://github.com/selinuxproject/selinux-kernel/wiki + +Information about the SELinux userspace can be found at: + + https://github.com/SELinuxProject/selinux/wiki + If you want to use SELinux, chances are you will want to use the distro-provided policies, or install the latest reference policy release from diff --git a/Documentation/admin-guide/LSM/SafeSetID.rst b/Documentation/admin-guide/LSM/SafeSetID.rst index 0ec34863c674..6d439c987563 100644 --- a/Documentation/admin-guide/LSM/SafeSetID.rst +++ b/Documentation/admin-guide/LSM/SafeSetID.rst @@ -41,7 +41,7 @@ namespace). The higher level goal is to allow for uid-based sandboxing of system services without having to give out CAP_SETUID all over the place just so that non-root programs can drop to even-lesser-privileged uids. This is especially relevant when one non-root daemon on the system should be allowed to spawn other -processes as different uids, but its undesirable to give the daemon a +processes as different uids, but it's undesirable to give the daemon a basically-root-equivalent CAP_SETUID. diff --git a/Documentation/admin-guide/LSM/Smack.rst b/Documentation/admin-guide/LSM/Smack.rst index 6d44f4fdbf59..c5ed775f2d10 100644 --- a/Documentation/admin-guide/LSM/Smack.rst +++ b/Documentation/admin-guide/LSM/Smack.rst @@ -601,10 +601,15 @@ specification. Task Attribute ~~~~~~~~~~~~~~ -The Smack label of a process can be read from /proc/<pid>/attr/current. A -process can read its own Smack label from /proc/self/attr/current. A +The Smack label of a process can be read from ``/proc/<pid>/attr/current``. A +process can read its own Smack label from ``/proc/self/attr/current``. A privileged process can change its own Smack label by writing to -/proc/self/attr/current but not the label of another process. +``/proc/self/attr/current`` but not the label of another process. + +Format of writing is : only the label or the label followed by one of the +3 trailers: ``\n`` (by common agreement for ``/proc/...`` interfaces), +``\0`` (because some applications incorrectly include it), +``\n\0`` (because we think some applications may incorrectly include it). File Attribute ~~~~~~~~~~~~~~ @@ -696,6 +701,11 @@ sockets. A privileged program may set this to match the label of another task with which it hopes to communicate. +UNIX domain socket (UDS) with a BSD address functions both as a file in a +filesystem and as a socket. As a file, it carries the SMACK64 attribute. This +attribute is not involved in Smack security enforcement and is immutably +assigned the label "*". + Smack Netlabel Exceptions ~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/Documentation/admin-guide/LSM/index.rst b/Documentation/admin-guide/LSM/index.rst index ce63be6d64ad..b44ef68f6e4d 100644 --- a/Documentation/admin-guide/LSM/index.rst +++ b/Documentation/admin-guide/LSM/index.rst @@ -48,3 +48,4 @@ subdirectories. Yama SafeSetID ipe + landlock diff --git a/Documentation/admin-guide/LSM/ipe.rst b/Documentation/admin-guide/LSM/ipe.rst index f93a467db628..a756d8158531 100644 --- a/Documentation/admin-guide/LSM/ipe.rst +++ b/Documentation/admin-guide/LSM/ipe.rst @@ -95,7 +95,20 @@ languages when these scripts are invoked by passing these program files to the interpreter. This is because the way interpreters execute these files; the scripts themselves are not evaluated as executable code through one of IPE's hooks, but they are merely text files that are read -(as opposed to compiled executables) [#interpreters]_. +(as opposed to compiled executables). However, with the introduction of the +``AT_EXECVE_CHECK`` flag (:doc:`AT_EXECVE_CHECK </userspace-api/check_exec>`), +interpreters can use it to signal the kernel that a script file will be executed, +and request the kernel to perform LSM security checks on it. + +IPE's EXECUTE operation enforcement differs between compiled executables and +interpreted scripts: For compiled executables, enforcement is triggered +automatically by the kernel during ``execve()``, ``execveat()``, ``mmap()`` +and ``mprotect()`` syscalls when loading executable content. For interpreted +scripts, enforcement requires explicit interpreter integration using +``execveat()`` with ``AT_EXECVE_CHECK`` flag. Unlike exec syscalls that IPE +intercepts during the execution process, this mechanism needs the interpreter +to take the initiative, and existing interpreters won't be automatically +supported unless the signal call is added. Threat Model ------------ @@ -423,7 +436,7 @@ Field descriptions: Event Example:: - type=1422 audit(1653425529.927:53): policy_name="boot_verified" policy_version=0.0.0 policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F26765076DD8EED7B8F4DB auid=4294967295 ses=4294967295 lsm=ipe res=1 + type=1422 audit(1653425529.927:53): policy_name="boot_verified" policy_version=0.0.0 policy_digest=sha256:820EEA5B40CA42B51F68962354BA083122A20BB846F26765076DD8EED7B8F4DB auid=4294967295 ses=4294967295 lsm=ipe res=1 errno=0 type=1300 audit(1653425529.927:53): arch=c000003e syscall=1 success=yes exit=2567 a0=3 a1=5596fcae1fb0 a2=a07 a3=2 items=0 ppid=184 pid=229 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="python3" exe="/usr/bin/python3.10" key=(null) type=1327 audit(1653425529.927:53): PROCTITLE proctitle=707974686F6E3300746573742F6D61696E2E7079002D66002E2E @@ -433,24 +446,55 @@ This record will always be emitted in conjunction with a ``AUDITSYSCALL`` record Field descriptions: -+----------------+------------+-----------+---------------------------------------------------+ -| Field | Value Type | Optional? | Description of Value | -+================+============+===========+===================================================+ -| policy_name | string | No | The policy_name | -+----------------+------------+-----------+---------------------------------------------------+ -| policy_version | string | No | The policy_version | -+----------------+------------+-----------+---------------------------------------------------+ -| policy_digest | string | No | The policy hash | -+----------------+------------+-----------+---------------------------------------------------+ -| auid | integer | No | The login user ID | -+----------------+------------+-----------+---------------------------------------------------+ -| ses | integer | No | The login session ID | -+----------------+------------+-----------+---------------------------------------------------+ -| lsm | string | No | The lsm name associated with the event | -+----------------+------------+-----------+---------------------------------------------------+ -| res | integer | No | The result of the audited operation(success/fail) | -+----------------+------------+-----------+---------------------------------------------------+ - ++----------------+------------+-----------+-------------------------------------------------------------+ +| Field | Value Type | Optional? | Description of Value | ++================+============+===========+=============================================================+ +| policy_name | string | Yes | The policy_name | ++----------------+------------+-----------+-------------------------------------------------------------+ +| policy_version | string | Yes | The policy_version | ++----------------+------------+-----------+-------------------------------------------------------------+ +| policy_digest | string | Yes | The policy hash | ++----------------+------------+-----------+-------------------------------------------------------------+ +| auid | integer | No | The login user ID | ++----------------+------------+-----------+-------------------------------------------------------------+ +| ses | integer | No | The login session ID | ++----------------+------------+-----------+-------------------------------------------------------------+ +| lsm | string | No | The lsm name associated with the event | ++----------------+------------+-----------+-------------------------------------------------------------+ +| res | integer | No | The result of the audited operation(success/fail) | ++----------------+------------+-----------+-------------------------------------------------------------+ +| errno | integer | No | Error code from policy loading operations (see table below) | ++----------------+------------+-----------+-------------------------------------------------------------+ + +Policy error codes (errno): + +The following table lists the error codes that may appear in the errno field while loading or updating the policy: + ++----------------+--------------------------------------------------------+ +| Error Code | Description | ++================+========================================================+ +| 0 | Success | ++----------------+--------------------------------------------------------+ +| -EPERM | Insufficient permission | ++----------------+--------------------------------------------------------+ +| -EEXIST | Same name policy already deployed | ++----------------+--------------------------------------------------------+ +| -EBADMSG | Policy is invalid | ++----------------+--------------------------------------------------------+ +| -ENOMEM | Out of memory (OOM) | ++----------------+--------------------------------------------------------+ +| -ERANGE | Policy version number overflow | ++----------------+--------------------------------------------------------+ +| -EINVAL | Policy version parsing error | ++----------------+--------------------------------------------------------+ +| -ENOKEY | Key used to sign the IPE policy not found in keyring | ++----------------+--------------------------------------------------------+ +| -EKEYREJECTED | Policy signature verification failed | ++----------------+--------------------------------------------------------+ +| -ESTALE | Attempting to update an IPE policy with older version | ++----------------+--------------------------------------------------------+ +| -ENOENT | Policy was deleted while updating | ++----------------+--------------------------------------------------------+ 1404 AUDIT_MAC_STATUS ^^^^^^^^^^^^^^^^^^^^^ @@ -775,8 +819,6 @@ A: .. [#digest_cache_lsm] https://lore.kernel.org/lkml/20240415142436.2545003-1-roberto.sassu@huaweicloud.com/ -.. [#interpreters] There is `some interest in solving this issue <https://lore.kernel.org/lkml/20220321161557.495388-1-mic@digikod.net/>`_. - .. [#devdoc] Please see :doc:`the design docs </security/ipe>` for more on this topic. diff --git a/Documentation/admin-guide/LSM/landlock.rst b/Documentation/admin-guide/LSM/landlock.rst new file mode 100644 index 000000000000..9923874e2156 --- /dev/null +++ b/Documentation/admin-guide/LSM/landlock.rst @@ -0,0 +1,189 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. Copyright © 2025 Microsoft Corporation + +================================ +Landlock: system-wide management +================================ + +:Author: Mickaël Salaün +:Date: January 2026 + +Landlock can leverage the audit framework to log events. + +User space documentation can be found here: +Documentation/userspace-api/landlock.rst. + +Audit +===== + +Denied access requests are logged by default for a sandboxed program if `audit` +is enabled. This default behavior can be changed with the +sys_landlock_restrict_self() flags (cf. +Documentation/userspace-api/landlock.rst). Landlock logs can also be masked +thanks to audit rules. Landlock can generate 2 audit record types. + +Record types +------------ + +AUDIT_LANDLOCK_ACCESS + This record type identifies a denied access request to a kernel resource. + The ``domain`` field indicates the ID of the domain that blocked the + request. The ``blockers`` field indicates the cause(s) of this denial + (separated by a comma), and the following fields identify the kernel object + (similar to SELinux). There may be more than one of this record type per + audit event. + + Example with a file link request generating two records in the same event:: + + domain=195ba459b blockers=fs.refer path="/usr/bin" dev="vda2" ino=351 + domain=195ba459b blockers=fs.make_reg,fs.refer path="/usr/local" dev="vda2" ino=365 + + + The ``blockers`` field uses dot-separated prefixes to indicate the type of + restriction that caused the denial: + + **fs.*** - Filesystem access rights (ABI 1+): + - fs.execute, fs.write_file, fs.read_file, fs.read_dir + - fs.remove_dir, fs.remove_file + - fs.make_char, fs.make_dir, fs.make_reg, fs.make_sock + - fs.make_fifo, fs.make_block, fs.make_sym + - fs.refer (ABI 2+) + - fs.truncate (ABI 3+) + - fs.ioctl_dev (ABI 5+) + + **net.*** - Network access rights (ABI 4+): + - net.bind_tcp - TCP port binding was denied + - net.connect_tcp - TCP connection was denied + + **scope.*** - IPC scoping restrictions (ABI 6+): + - scope.abstract_unix_socket - Abstract UNIX socket connection denied + - scope.signal - Signal sending denied + + Multiple blockers can appear in a single event (comma-separated) when + multiple access rights are missing. For example, creating a regular file + in a directory that lacks both ``make_reg`` and ``refer`` rights would show + ``blockers=fs.make_reg,fs.refer``. + + The object identification fields (path, dev, ino for filesystem; opid, + ocomm for signals) depend on the type of access being blocked and provide + context about what resource was involved in the denial. + + +AUDIT_LANDLOCK_DOMAIN + This record type describes the status of a Landlock domain. The ``status`` + field can be either ``allocated`` or ``deallocated``. + + The ``allocated`` status is part of the same audit event and follows + the first logged ``AUDIT_LANDLOCK_ACCESS`` record of a domain. It identifies + Landlock domain information at the time of the sys_landlock_restrict_self() + call with the following fields: + + - the ``domain`` ID + - the enforcement ``mode`` + - the domain creator's ``pid`` + - the domain creator's ``uid`` + - the domain creator's executable path (``exe``) + - the domain creator's command line (``comm``) + + Example:: + + domain=195ba459b status=allocated mode=enforcing pid=300 uid=0 exe="/root/sandboxer" comm="sandboxer" + + The ``deallocated`` status is an event on its own and it identifies a + Landlock domain release. After such event, it is guarantee that the + related domain ID will never be reused during the lifetime of the system. + The ``domain`` field indicates the ID of the domain which is released, and + the ``denials`` field indicates the total number of denied access request, + which might not have been logged according to the audit rules and + sys_landlock_restrict_self()'s flags. + + Example:: + + domain=195ba459b status=deallocated denials=3 + + +Event samples +-------------- + +Here are two examples of log events (see serial numbers). + +In this example a sandboxed program (``kill``) tries to send a signal to the +init process, which is denied because of the signal scoping restriction +(``LL_SCOPED=s``):: + + $ LL_FS_RO=/ LL_FS_RW=/ LL_SCOPED=s LL_FORCE_LOG=1 ./sandboxer kill 1 + +This command generates two events, each identified with a unique serial +number following a timestamp (``msg=audit(1729738800.268:30)``). The first +event (serial ``30``) contains 4 records. The first record +(``type=LANDLOCK_ACCESS``) shows an access denied by the domain `1a6fdc66f`. +The cause of this denial is signal scoping restriction +(``blockers=scope.signal``). The process that would have receive this signal +is the init process (``opid=1 ocomm="systemd"``). + +The second record (``type=LANDLOCK_DOMAIN``) describes (``status=allocated``) +domain `1a6fdc66f`. This domain was created by process ``286`` executing the +``/root/sandboxer`` program launched by the root user. + +The third record (``type=SYSCALL``) describes the syscall, its provided +arguments, its result (``success=no exit=-1``), and the process that called it. + +The fourth record (``type=PROCTITLE``) shows the command's name as an +hexadecimal value. This can be translated with ``python -c +'print(bytes.fromhex("6B696C6C0031"))'``. + +Finally, the last record (``type=LANDLOCK_DOMAIN``) is also the only one from +the second event (serial ``31``). It is not tied to a direct user space action +but an asynchronous one to free resources tied to a Landlock domain +(``status=deallocated``). This can be useful to know that the following logs +will not concern the domain ``1a6fdc66f`` anymore. This record also summarize +the number of requests this domain denied (``denials=1``), whether they were +logged or not. + +.. code-block:: + + type=LANDLOCK_ACCESS msg=audit(1729738800.268:30): domain=1a6fdc66f blockers=scope.signal opid=1 ocomm="systemd" + type=LANDLOCK_DOMAIN msg=audit(1729738800.268:30): domain=1a6fdc66f status=allocated mode=enforcing pid=286 uid=0 exe="/root/sandboxer" comm="sandboxer" + type=SYSCALL msg=audit(1729738800.268:30): arch=c000003e syscall=62 success=no exit=-1 [..] ppid=272 pid=286 auid=0 uid=0 gid=0 [...] comm="kill" [...] + type=PROCTITLE msg=audit(1729738800.268:30): proctitle=6B696C6C0031 + type=LANDLOCK_DOMAIN msg=audit(1729738800.324:31): domain=1a6fdc66f status=deallocated denials=1 + +Here is another example showcasing filesystem access control:: + + $ LL_FS_RO=/ LL_FS_RW=/tmp LL_FORCE_LOG=1 ./sandboxer sh -c "echo > /etc/passwd" + +The related audit logs contains 8 records from 3 different events (serials 33, +34 and 35) created by the same domain `1a6fdc679`:: + + type=LANDLOCK_ACCESS msg=audit(1729738800.221:33): domain=1a6fdc679 blockers=fs.write_file path="/dev/tty" dev="devtmpfs" ino=9 + type=LANDLOCK_DOMAIN msg=audit(1729738800.221:33): domain=1a6fdc679 status=allocated mode=enforcing pid=289 uid=0 exe="/root/sandboxer" comm="sandboxer" + type=SYSCALL msg=audit(1729738800.221:33): arch=c000003e syscall=257 success=no exit=-13 [...] ppid=272 pid=289 auid=0 uid=0 gid=0 [...] comm="sh" [...] + type=PROCTITLE msg=audit(1729738800.221:33): proctitle=7368002D63006563686F203E202F6574632F706173737764 + type=LANDLOCK_ACCESS msg=audit(1729738800.221:34): domain=1a6fdc679 blockers=fs.write_file path="/etc/passwd" dev="vda2" ino=143821 + type=SYSCALL msg=audit(1729738800.221:34): arch=c000003e syscall=257 success=no exit=-13 [...] ppid=272 pid=289 auid=0 uid=0 gid=0 [...] comm="sh" [...] + type=PROCTITLE msg=audit(1729738800.221:34): proctitle=7368002D63006563686F203E202F6574632F706173737764 + type=LANDLOCK_DOMAIN msg=audit(1729738800.261:35): domain=1a6fdc679 status=deallocated denials=2 + + +Event filtering +--------------- + +If you get spammed with audit logs related to Landlock, this is either an +attack attempt or a bug in the security policy. We can put in place some +filters to limit noise with two complementary ways: + +- with sys_landlock_restrict_self()'s flags if we can fix the sandboxed + programs, +- or with audit rules (see :manpage:`auditctl(8)`). + +Additional documentation +======================== + +* `Linux Audit Documentation`_ +* Documentation/userspace-api/landlock.rst +* Documentation/security/landlock.rst +* https://landlock.io + +.. Links +.. _Linux Audit Documentation: + https://github.com/linux-audit/audit-documentation/wiki |
