lwn.git/arch/x86/kernel/ptrace.c, branch docs-6.9

x86: Add PTRACE interface for shadow stack

2023-08-02T22:01:51+00:00

Some applications (like GDB) would like to tweak shadow stack state via ptrace. This allows for existing functionality to continue to work for seized shadow stack applications. Provide a regset interface for manipulating the shadow stack pointer (SSP). There is already ptrace functionality for accessing xstate, but this does not include supervisor xfeatures. So there is not a completely clear place for where to put the shadow stack state. Adding it to the user xfeatures regset would complicate that code, as it currently shares logic with signals which should not have supervisor features. Don't add a general supervisor xfeature regset like the user one, because it is better to maintain flexibility for other supervisor xfeatures to define their own interface. For example, an xfeature may decide not to expose all of it's state to userspace, as is actually the case for shadow stack ptrace functionality. A lot of enum values remain to be used, so just put it in dedicated shadow stack regset. The only downside to not having a generic supervisor xfeature regset, is that apps need to be enlightened of any new supervisor xfeature exposed this way (i.e. they can't try to have generic save/restore logic). But maybe that is a good thing, because they have to think through each new xfeature instead of encountering issues when a new supervisor xfeature was added. By adding a shadow stack regset, it also has the effect of including the shadow stack state in a core dump, which could be useful for debugging. The shadow stack specific xstate includes the SSP, and the shadow stack and WRSS enablement status. Enabling shadow stack or WRSS in the kernel involves more than just flipping the bit. The kernel is made aware that it has to do extra things when cloning or handling signals. That logic is triggered off of separate feature enablement state kept in the task struct. So the flipping on HW shadow stack enforcement without notifying the kernel to change its behavior would severely limit what an application could do without crashing, and the results would depend on kernel internal implementation details. There is also no known use for controlling this state via ptrace today. So only expose the SSP, which is something that userspace already has indirect control over. Co-developed-by: Yu-cheng Yu Signed-off-by: Yu-cheng Yu Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Reviewed-by: Borislav Petkov (AMD) Reviewed-by: Kees Cook Acked-by: Mike Rapoport (IBM) Tested-by: Pengfei Xu Tested-by: John Allen Tested-by: Kees Cook Link: https://lore.kernel.org/all/20230613001108.3040476-41-rick.p.edgecombe%40intel.com

x86: Improve formatting of user_regset arrays

2022-11-01T22:36:52+00:00

Back in 2018, Ingo Molnar suggested[0] to improve the formatting of the struct user_regset arrays. They have multiple member initializations per line and some lines exceed 100 chars. Reformat them like he suggested. [0] https://lore.kernel.org/lkml/20180711102035.GB8574@gmail.com/ Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Link: https://lore.kernel.org/all/20221021221803.10910-3-rick.p.edgecombe%40intel.com

x86: Separate out x86_regset for 32 and 64 bit

2022-11-01T22:36:52+00:00

In fill_thread_core_info() the ptrace accessible registers are collected for a core file to be written out as notes. The note array is allocated from a size calculated by iterating the user regset view, and counting the regsets that have a non-zero core_note_type. However, this only allows for there to be non-zero core_note_type at the end of the regset view. If there are any in the middle, fill_thread_core_info() will overflow the note allocation, as it iterates over the size of the view and the allocation would be smaller than that. To apparently avoid this problem, x86_32_regsets and x86_64_regsets need to be constructed in a special way. They both draw their indices from a shared enum x86_regset, but 32 bit and 64 bit don't all support the same regsets and can be compiled in at the same time in the case of IA32_EMULATION. So this enum has to be laid out in a special way such that there are no gaps for both x86_32_regsets and x86_64_regsets. This involves ordering them just right by creating aliases for enum’s that are only in one view or the other, or creating multiple versions like REGSET32_IOPERM/REGSET64_IOPERM. So the collection of the registers tries to minimize the size of the allocation, but it doesn’t quite work. Then the x86 ptrace side works around it by constructing the enum just right to avoid a problem. In the end there is no functional problem, but it is somewhat strange and fragile. It could also be improved like this [1], by better utilizing the smaller array, but this still wastes space in the regset array’s if they are not carefully crafted to avoid gaps. Instead, just fully separate out the enums and give them separate 32 and 64 enum names. Add some bitsize-free defines for REGSET_GENERAL and REGSET_FP since they are the only two referred to in bitsize generic code. While introducing a bunch of new 32/64 enums, change the pattern of the name from REGSET_FOO32 to REGSET32_FOO to better indicate that the 32 is in reference to the CPU mode and not the register size, as suggested by Eric Biederman. This should have no functional change and is only changing how constants are generated and referred to. [1] https://lore.kernel.org/lkml/20180717162502.32274-1-yu-cheng.yu@intel.com/ Signed-off-by: Rick Edgecombe Signed-off-by: Dave Hansen Link: https://lore.kernel.org/all/20221021221803.10910-2-rick.p.edgecombe%40intel.com

x86/32: Remove lazy GS macros

2022-04-14T12:09:43+00:00

GS is always a user segment now. Signed-off-by: Brian Gerst Signed-off-by: Borislav Petkov Reviewed-by: Thomas Gleixner Acked-by: Andy Lutomirski Link: https://lore.kernel.org/r/20220325153953.162643-4-brgerst@gmail.com

Merge tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

2022-03-29T00:29:53+00:00

Pull ptrace cleanups from Eric Biederman: "This set of changes removes tracehook.h, moves modification of all of the ptrace fields inside of siglock to remove races, adds a missing permission check to ptrace.c The removal of tracehook.h is quite significant as it has been a major source of confusion in recent years. Much of that confusion was around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled making the semantics clearer). For people who don't know tracehook.h is a vestiage of an attempt to implement uprobes like functionality that was never fully merged, and was later superseeded by uprobes when uprobes was merged. For many years now we have been removing what tracehook functionaly a little bit at a time. To the point where anything left in tracehook.h was some weird strange thing that was difficult to understand" * tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ptrace: Remove duplicated include in ptrace.c ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE ptrace: Return the signal to continue with from ptrace_stop ptrace: Move setting/clearing ptrace_message into ptrace_stop tracehook: Remove tracehook.h resume_user_mode: Move to resume_user_mode.h resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume signal: Move set_notify_signal and clear_notify_signal into sched/signal.h task_work: Decouple TIF_NOTIFY_SIGNAL and task_work task_work: Call tracehook_notify_signal from get_signal on all architectures task_work: Introduce task_work_pending task_work: Remove unnecessary include from posix_timers.h ptrace: Remove tracehook_signal_handler ptrace: Remove arch_syscall_{enter,exit}_tracehook ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h ptrace/arm: Rename tracehook_report_syscall report_syscall ptrace: Move ptrace_report_syscall into ptrace.h

tracehook: Remove tracehook.h

2022-03-10T22:51:51+00:00

Now that all of the definitions have moved out of tracehook.h into ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in tracehook.h so remove it. Update the few files that were depending upon tracehook.h to bring in definitions to use the headers they need directly. Reviewed-by: Kees Cook Link: https://lkml.kernel.org/r/20220309162454.123006-13-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman"

x86/ptrace: Fix xfpregs_set()'s incorrect xmm clearing

2022-02-18T10:23:21+00:00

xfpregs_set() handles 32-bit REGSET_XFP and 64-bit REGSET_FP. The actual code treats these regsets as modern FX state (i.e. the beginning part of XSTATE). The declarations of the regsets thought they were the legacy i387 format. The code thought they were the 32-bit (no xmm8..15) variant of XSTATE and, for good measure, made the high bits disappear by zeroing the wrong part of the buffer. The latter broke ptrace, and everything else confused anyone trying to understand the code. In particular, the nonsense definitions of the regsets confused me when I wrote this code. Clean this all up. Change the declarations to match reality (which shouldn't change the generated code, let alone the ABI) and fix xfpregs_set() to clear the correct bits and to only do so for 32-bit callers. Fixes: 6164331d15f7 ("x86/fpu: Rewrite xfpregs_set()") Reported-by: Luís Ferreira Signed-off-by: Andy Lutomirski Signed-off-by: Borislav Petkov Cc: Link: https://bugzilla.kernel.org/show_bug.cgi?id=215524 Link: https://lore.kernel.org/r/YgpFnZpF01WwR8wU@zn.tnic

x86/fpu: Remove internal.h dependency from fpu/signal.h

2021-10-20T13:27:29+00:00

In order to remove internal.h make signal.h independent of it. Include asm/fpu/xstate.h to fix a missing update_regset_xstate_info() prototype, which is Reported-by: kernel test robot Signed-off-by: Thomas Gleixner Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/20211015011539.844565975@linutronix.de

x86/regs: Syscall_get_nr() returns -1 for a non-system call

2021-05-12T08:49:15+00:00

syscall_get_nr() is defined to return -1 for a non-system call or a ptrace/seccomp restart; not just any arbitrary number. See comment in for the official definition of this function. Signed-off-by: H. Peter Anvin Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/20210510185316.3307264-7-hpa@zytor.com

x86/ptrace: Clean up PTRACE_GETREGS/PTRACE_PUTREGS regset selection

2021-02-04T11:33:15+00:00

task_user_regset_view() has nonsensical semantics, but those semantics appear to be relied on by existing users of PTRACE_GETREGSET and PTRACE_SETREGSET. (See added comments below for details.) It shouldn't be used for PTRACE_GETREGS or PTRACE_SETREGS, though. A native 64-bit ptrace() call and an x32 ptrace() call using GETREGS or SETREGS wants the 64-bit regset views, and a 32-bit ptrace() call (native or compat) should use the 32-bit regset. task_user_regset_view() almost does this except that it will malfunction if a ptracer is itself ptraced and the outer ptracer modifies CS on entry to a ptrace() syscall. Hopefully that has never happened. (The compat ptrace() code already hardcoded the 32-bit regset, so this change has no effect on that path.) Improve the situation and deobfuscate the code by hardcoding the 64-bit view in the x32 ptrace() and selecting the view based on the kernel config in the native ptrace(). I tried to figure out the history behind this API. I naïvely assumed that PTRAGE_GETREGSET and PTRACE_SETREGSET were ancient APIs that predated compat, but no. They were introduced by 2225a122ae26 ("ptrace: Add support for generic PTRACE_GETREGSET/PTRACE_SETREGSET") in 2010, and they are simply a poor design. ELF core dumps have the ELF e_machine field and a bunch of register sets in ELF notes, and the pair (e_machine, NT_XXX) indicates the format of the regset blob. But the new PTRACE_GET/SETREGSET API coopted the NT_XXX numbering without any way to specify which e_machine was in effect. This is especially bad on x86, where a process can freely switch between 32-bit and 64-bit mode, and, in fact, the PTRAGE_SETREGSET call itself can cause this switch to happen. Oops. Signed-off-by: Andy Lutomirski Signed-off-by: Borislav Petkov Link: https://lkml.kernel.org/r/9daa791d0c7eaebd59c5bc2b2af1b0e7bebe707d.1612375698.git.luto@kernel.org