diff options
| author | Linus Torvalds <torvalds@linux-foundation.org> | 2017-09-04 09:52:57 -0700 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2017-09-04 09:52:57 -0700 |
| commit | b0c79f49c343cda8954b3322984c32f258ca4ccb (patch) | |
| tree | dd823d13683b7e6b0caebcaf3964df6150aee294 /tools/objtool/Documentation | |
| parent | f213a6c84c1b4b396a0713ee33cff0e02ba8235f (diff) | |
| parent | dd88a0a0c8615417fe6b4285769b5b772de87279 (diff) | |
| download | lwn-b0c79f49c343cda8954b3322984c32f258ca4ccb.tar.gz lwn-b0c79f49c343cda8954b3322984c32f258ca4ccb.zip | |
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
- Introduce the ORC unwinder, which can be enabled via
CONFIG_ORC_UNWINDER=y.
The ORC unwinder is a lightweight, Linux kernel specific debuginfo
implementation, which aims to be DWARF done right for unwinding.
Objtool is used to generate the ORC unwinder tables during build, so
the data format is flexible and kernel internal: there's no
dependency on debuginfo created by an external toolchain.
The ORC unwinder is almost two orders of magnitude faster than the
(out of tree) DWARF unwinder - which is important for perf call graph
profiling. It is also significantly simpler and is coded defensively:
there has not been a single ORC related kernel crash so far, even
with early versions. (knock on wood!)
But the main advantage is that enabling the ORC unwinder allows
CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
measurably:
With frame pointers disabled, GCC does not have to add frame pointer
instrumentation code to every function in the kernel. The kernel's
.text size decreases by about 3.2%, resulting in better cache
utilization and fewer instructions executed, resulting in a broad
kernel-wide speedup. Average speedup of system calls should be
roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
a speedup of 5-10% for some function execution intense workloads.
The main cost of the unwinder is that the unwinder data has to be
stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
config - which is a modest cost on modern x86 systems.
Given how young the ORC unwinder code is it's not enabled by default
- but given the performance advantages the plan is to eventually make
it the default unwinder on x86.
See Documentation/x86/orc-unwinder.txt for more details.
- Remove lguest support: its intended role was that of a temporary
proof of concept for virtualization, plus its removal will enable the
reduction (removal) of the paravirt API as well, so Rusty agreed to
its removal. (Juergen Gross)
- Clean up and fix FSGS related functionality (Andy Lutomirski)
- Clean up IO access APIs (Andy Shevchenko)
- Enhance the symbol namespace (Jiri Slaby)
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
objtool: Handle GCC stack pointer adjustment bug
x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
x86/fpu/math-emu: Add ENDPROC to functions
x86/boot/64: Extract efi_pe_entry() from startup_64()
x86/boot/32: Extract efi_pe_entry() from startup_32()
x86/lguest: Remove lguest support
x86/paravirt/xen: Remove xen_patch()
objtool: Fix objtool fallthrough detection with function padding
x86/xen/64: Fix the reported SS and CS in SYSCALL
objtool: Track DRAP separately from callee-saved registers
objtool: Fix validate_branch() return codes
x86: Clarify/fix no-op barriers for text_poke_bp()
x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
selftests/x86/fsgsbase: Test selectors 1, 2, and 3
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
x86/asm/32: Fix regs_get_register() on segment registers
x86/xen/64: Rearrange the SYSCALL entries
x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
...
Diffstat (limited to 'tools/objtool/Documentation')
| -rw-r--r-- | tools/objtool/Documentation/stack-validation.txt | 56 |
1 files changed, 17 insertions, 39 deletions
diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt index 17c1195f11f4..6a1af43862df 100644 --- a/tools/objtool/Documentation/stack-validation.txt +++ b/tools/objtool/Documentation/stack-validation.txt @@ -11,9 +11,6 @@ analyzes every .o file and ensures the validity of its stack metadata. It enforces a set of rules on asm code and C inline assembly code so that stack traces can be reliable. -Currently it only checks frame pointer usage, but there are plans to add -CFI validation for C files and CFI generation for asm files. - For each function, it recursively follows all possible code paths and validates the correct frame pointer state at each instruction. @@ -23,6 +20,10 @@ alternative execution paths to a given instruction (or set of instructions). Similarly, it knows how to follow switch statements, for which gcc sometimes uses jump tables. +(Objtool also has an 'orc generate' subcommand which generates debuginfo +for the ORC unwinder. See Documentation/x86/orc-unwinder.txt in the +kernel tree for more details.) + Why do we need stack metadata validation? ----------------------------------------- @@ -93,37 +94,14 @@ a) More reliable stack traces for frame pointer enabled kernels or at the very end of the function after the stack frame has been destroyed. This is an inherent limitation of frame pointers. -b) 100% reliable stack traces for DWARF enabled kernels - - (NOTE: This is not yet implemented) - - As an alternative to frame pointers, DWARF Call Frame Information - (CFI) metadata can be used to walk the stack. Unlike frame pointers, - CFI metadata is out of band. So it doesn't affect runtime - performance and it can be reliable even when interrupts or exceptions - are involved. - - For C code, gcc automatically generates DWARF CFI metadata. But for - asm code, generating CFI is a tedious manual approach which requires - manually placed .cfi assembler macros to be scattered throughout the - code. It's clumsy and very easy to get wrong, and it makes the real - code harder to read. - - Stacktool will improve this situation in several ways. For code - which already has CFI annotations, it will validate them. For code - which doesn't have CFI annotations, it will generate them. So an - architecture can opt to strip out all the manual .cfi annotations - from their asm code and have objtool generate them instead. +b) ORC (Oops Rewind Capability) unwind table generation - We might also add a runtime stack validation debug option where we - periodically walk the stack from schedule() and/or an NMI to ensure - that the stack metadata is sane and that we reach the bottom of the - stack. + An alternative to frame pointers and DWARF, ORC unwind data can be + used to walk the stack. Unlike frame pointers, ORC data is out of + band. So it doesn't affect runtime performance and it can be + reliable even when interrupts or exceptions are involved. - So the benefit of objtool here will be that external tooling should - always show perfect stack traces. And the same will be true for - kernel warning/oops traces if the architecture has a runtime DWARF - unwinder. + For more details, see Documentation/x86/orc-unwinder.txt. c) Higher live patching compatibility rate @@ -211,7 +189,7 @@ they mean, and suggestions for how to fix them. function, add proper frame pointer logic using the FRAME_BEGIN and FRAME_END macros. Otherwise, if it's not a callable function, remove its ELF function annotation by changing ENDPROC to END, and instead - use the manual CFI hint macros in asm/undwarf.h. + use the manual unwind hint macros in asm/unwind_hints.h. If it's a GCC-compiled .c file, the error may be because the function uses an inline asm() statement which has a "call" instruction. An @@ -231,8 +209,8 @@ they mean, and suggestions for how to fix them. If the error is for an asm file, and the instruction is inside (or reachable from) a callable function, the function should be annotated with the ENTRY/ENDPROC macros (ENDPROC is the important one). - Otherwise, the code should probably be annotated with the CFI hint - macros in asm/undwarf.h so objtool and the unwinder can know the + Otherwise, the code should probably be annotated with the unwind hint + macros in asm/unwind_hints.h so objtool and the unwinder can know the stack state associated with the code. If you're 100% sure the code won't affect stack traces, or if you're @@ -258,7 +236,7 @@ they mean, and suggestions for how to fix them. instructions aren't allowed in a callable function, and are most likely part of the kernel entry code. They should usually not have the callable function annotation (ENDPROC) and should always be - annotated with the CFI hint macros in asm/undwarf.h. + annotated with the unwind hint macros in asm/unwind_hints.h. 6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame @@ -272,7 +250,7 @@ they mean, and suggestions for how to fix them. If the instruction is not actually in a callable function (e.g. kernel entry code), change ENDPROC to END and annotate manually with - the CFI hint macros in asm/undwarf.h. + the unwind hint macros in asm/unwind_hints.h. 7. file: warning: objtool: func()+0x5c: stack state mismatch @@ -288,8 +266,8 @@ they mean, and suggestions for how to fix them. Another possibility is that the code has some asm or inline asm which does some unusual things to the stack or the frame pointer. In such - cases it's probably appropriate to use the CFI hint macros in - asm/undwarf.h. + cases it's probably appropriate to use the unwind hint macros in + asm/unwind_hints.h. 8. file.o: warning: objtool: funcA() falls through to next function funcB() |
