path: arch/powerpc/kernel/trace/ftrace.c
2024-11-23  Merge tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux  (Linus Torvalds)

Pull powerpc updates from Michael Ellerman:

 - Rework kfence support for the HPT MMU to work on systems with >= 16TB of RAM.

 - Remove the powerpc "maple" platform, used by the "Yellow Dog Powerstation".

 - Add support for DYNAMIC_FTRACE_WITH_CALL_OPS, DYNAMIC_FTRACE_WITH_DIRECT_CALLS & BPF Trampolines.

 - Add support for running KVM nested guests on Power11.

 - Other small features, cleanups and fixes.

Thanks to Amit Machhiwal, Arnd Bergmann, Christophe Leroy, Costa Shulyupin, David Hunter, David Wang, Disha Goel, Gautam Menghani, Geert Uytterhoeven, Hari Bathini, Julia Lawall, Kajol Jain, Keith Packard, Lukas Bulwahn, Madhavan Srinivasan, Markus Elfring, Michal Suchanek, Ming Lei, Mukesh Kumar Chaurasiya, Nathan Chancellor, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A, Paulo Miguel Almeida, Pavithra Prakash, Ritesh Harjani (IBM), Rob Herring (Arm), Sachin P Bappalige, Shen Lichuan, Simon Horman, Sourabh Jain, Thomas Weißschuh, Thorsten Blum, Thorsten Leemhuis, Venkat Rao Bagalkote, Zhang Zekun, and zhang jiao.

* tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (89 commits)
  EDAC/powerpc: Remove PPC_MAPLE drivers
  powerpc/perf: Add per-task/process monitoring to vpa_pmu driver
  powerpc/kvm: Add vpa latency counters to kvm_vcpu_arch
  docs: ABI: sysfs-bus-event_source-devices-vpa-pmu: Document sysfs event format entries for vpa_pmu
  powerpc/perf: Add perf interface to expose vpa counters
  MAINTAINERS: powerpc: Mark Maddy as "M"
  powerpc/Makefile: Allow overriding CPP
  powerpc-km82xx.c: replace of_node_put() with __free
  ps3: Correct some typos in comments
  powerpc/kexec: Fix return of uninitialized variable
  macintosh: Use common error handling code in via_pmu_led_init()
  powerpc/powermac: Use of_property_match_string() in pmac_has_backlight_type()
  powerpc: remove dead config options for MPC85xx platform support
  powerpc/xive: Use cpumask_intersects()
  selftests/powerpc: Remove the path after initialization.
  powerpc/xmon: symbol lookup length fixed
  powerpc/ep8248e: Use %pa to format resource_size_t
  powerpc/ps3: Reorganize kerneldoc parameter names
  KVM: PPC: Book3S HV: Fix kmv -> kvm typo
  powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static
  ...
2024-11-23  Merge tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm  (Linus Torvalds)

Pull MM updates from Andrew Morton:

 - The series "zram: optimal post-processing target selection" from Sergey Senozhatsky improves zram's post-processing selection algorithm. This leads to improved memory savings.

 - Wei Yang has gone to town on the mapletree code, contributing several series which clean up the implementation:
     - "refine mas_mab_cp()"
     - "Reduce the space to be cleared for maple_big_node"
     - "maple_tree: simplify mas_push_node()"
     - "Following cleanup after introduce mas_wr_store_type()"
     - "refine storing null"

 - The series "selftests/mm: hugetlb_fault_after_madv improvements" from David Hildenbrand fixes this selftest for s390.

 - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng implements some rationalizations and cleanups in the page mapping code.

 - The series "mm: optimize shadow entries removal" from Shakeel Butt optimizes the file truncation code by speeding up the handling of shadow entries.

 - The series "Remove PageKsm()" from Matthew Wilcox completes the migration of this flag over to being a folio-based flag.

 - The series "Unify hugetlb into arch_get_unmapped_area functions" from Oscar Salvador implements a bunch of consolidations and cleanups in the hugetlb code.

 - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain takes away the wp-fault time practice of turning a huge zero page into small pages. Instead we replace the whole thing with a THP. More consistent, cleaner, and potentially saves a large number of pagefaults.

 - The series "percpu: Add a test case and fix for clang" from Andy Shevchenko enhances and fixes the kernel's built-in percpu test code.

 - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett optimizes mremap() by avoiding doing things which we didn't need to do.

 - The series "Improve the tmpfs large folio read performance" from Baolin Wang teaches tmpfs to copy data into userspace at the folio size rather than as individual pages. A 20% speedup was observed.

 - The series "mm/damon/vaddr: Fix issue in damon_va_evenly_split_region()" from Zheng Yejian fixes DAMON splitting.

 - The series "memcg-v1: fully deprecate charge moving" from Shakeel Butt removes the long-deprecated memcgv2 charge moving feature.

 - The series "fix error handling in mmap_region() and refactor" from Lorenzo Stoakes cleans up some of the mmap() error handling and addresses some potential performance issues.

 - The series "x86/module: use large ROX pages for text allocations" from Mike Rapoport teaches x86 to use large pages for read-only-execute module text.

 - The series "page allocation tag compression" from Suren Baghdasaryan is follow-on maintenance work for the new page allocation profiling feature.

 - The series "page->index removals in mm" from Matthew Wilcox removes most references to page->index in mm/. A slow march towards shrinking struct page.

 - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs interface tests" from Andrew Paniakin performs maintenance work for DAMON's self testing code.

 - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar improves zswap's batching of compression and decompression. It is a step along the way towards using Intel IAA hardware acceleration for this zswap operation.

 - The series "kasan: migrate the last module test to kunit" from Sabyrzhan Tasbolatov completes the migration of the KASAN built-in tests over to the KUnit framework.

 - The series "implement lightweight guard pages" from Lorenzo Stoakes permits userspace to place fault-generating guard pages within a single VMA, rather than requiring that multiple VMAs be created for this. Improved efficiencies for userspace memory allocators are expected.

 - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses tracepoints to provide increased visibility into memcg stats flushing activity.

 - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky fixes a zram buglet which potentially affected performance.

 - The series "mm: add more kernel parameters to control mTHP" from Maíra Canal enhances our ability to control/configure multisize THP from the kernel boot command line.

 - The series "kasan: few improvements on kunit tests" from Sabyrzhan Tasbolatov has a couple of fixups for the KASAN KUnit tests.

 - The series "mm/list_lru: Split list_lru lock into per-cgroup scope" from Kairui Song optimizes list_lru memory utilization when lockdep is enabled.

* tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits)
  cma: enforce non-zero pageblock_order during cma_init_reserved_mem()
  mm/kfence: add a new kunit test test_use_after_free_read_nofault()
  zram: fix NULL pointer in comp_algorithm_show()
  memcg/hugetlb: add hugeTLB counters to memcg
  vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
  mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount
  zram: ZRAM_DEF_COMP should depend on ZRAM
  MAINTAINERS/MEMORY MANAGEMENT: add document files for mm
  Docs/mm/damon: recommend academic papers to read and/or cite
  mm: define general function pXd_init()
  kmemleak: iommu/iova: fix transient kmemleak false positive
  mm/list_lru: simplify the list_lru walk callback function
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: code clean up for reparenting
  mm/list_lru: don't export list_lru_add
  mm/list_lru: don't pass unnecessary key parameters
  kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller
  kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW
  kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols
  ...
2024-11-07  asm-generic: introduce text-patching.h  (Mike Rapoport (Microsoft))

Several architectures support text patching, but they name the header files that declare patching functions differently. Make all such headers consistently named text-patching.h and add an empty header in asm-generic for architectures that do not support text patching.

Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: kdevops <kdevops@lists.linux.dev>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-10-31  powerpc/ftrace: Add support for DYNAMIC_FTRACE_WITH_DIRECT_CALLS  (Naveen N Rao)

Add support for DYNAMIC_FTRACE_WITH_DIRECT_CALLS similar to the arm64 implementation.

ftrace direct calls allow custom trampolines to be called into directly from function ftrace call sites, bypassing the ftrace trampoline completely. This functionality is currently utilized by BPF trampolines to hook into kernel function entries.

Since we have limited relative branch range, we support ftrace direct calls through support for DYNAMIC_FTRACE_WITH_CALL_OPS. In this approach, the ftrace trampoline is not entirely bypassed. Rather, it is re-purposed into a stub that reads the direct_call field from the associated ftrace_ops structure and branches into that, if it is not NULL. For this, it is sufficient if we can ensure that the ftrace trampoline is reachable from all traceable functions.

When multiple ftrace_ops are associated with a call site, we utilize a callback to set pt_regs->orig_gpr3 that can then be tested on the return path from the ftrace trampoline to branch into the direct caller.
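As a hedged illustration of the dispatch described above (a minimal sketch assuming the upstream ftrace_ops::direct_call field; the helper name is made up, and the real stub is written in assembly):

	/* Illustrative only: prefer the attached direct trampoline, e.g. a
	 * BPF trampoline, over the normal return path. */
	static unsigned long ftrace_pick_return(struct ftrace_ops *ops,
						unsigned long default_ret)
	{
		if (ops && ops->direct_call)
			return ops->direct_call;
		return default_ret;
	}

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-16-hbathini@linux.ibm.com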
2024-10-31  powerpc/ftrace: Add support for DYNAMIC_FTRACE_WITH_CALL_OPS  (Naveen N Rao)

Implement support for DYNAMIC_FTRACE_WITH_CALL_OPS similar to the arm64 implementation.

This works by patching-in a pointer to an associated ftrace_ops structure before each traceable function. If multiple ftrace_ops are associated with a call site, then a special ftrace_list_ops is used to enable iterating over all the registered ftrace_ops. If no ftrace_ops are associated with a call site, then a special ftrace_nop_ops structure is used to render the ftrace call as a no-op. The ftrace trampoline can then read the associated ftrace_ops for a call site by loading from an offset from the LR, and branch directly to the associated function.

The primary advantage with this approach is that we don't have to iterate over all the registered ftrace_ops for call sites that have a single ftrace_ops registered. This is the equivalent of implementing support for dynamic ftrace trampolines, which set up a special ftrace trampoline for each registered ftrace_ops and have individual call sites branch into those directly.

A secondary advantage is that this gives us a way to add support for direct ftrace callers without having to resort to using stubs. The address of the direct call trampoline can be loaded from the ftrace_ops structure.

To support this, we reserve a nop before each function on 32-bit powerpc. For 64-bit powerpc, two nops are reserved before each out-of-line stub. During ftrace activation, we update this location with the associated ftrace_ops pointer. Then, on ftrace entry, we load from this location and call into ftrace_ops->func(). For 64-bit powerpc, we ensure that the out-of-line stub area is doubleword aligned so that the ftrace_ops address can be updated atomically.
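A hedged sketch of the lookup this layout enables; the offset arithmetic is an illustrative assumption, not the exact stub geometry:

	/* Illustrative only: recover the ftrace_ops pointer patched into
	 * the doubleword slot reserved just before the call site. */
	static struct ftrace_ops *arch_ftrace_call_ops(unsigned long ip)
	{
		/* ip: address of the patched 'bl' in the stub */
		return *(struct ftrace_ops **)(ip - sizeof(struct ftrace_ops *));
	}

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-15-hbathini@linux.ibm.com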
2024-10-31  powerpc64/ftrace: Support .text larger than 32MB with out-of-line stubs  (Naveen N Rao)

We are restricted to a .text size of ~32MB when using the out-of-line function profile sequence. Allow this to be extended up to the previous limit of ~64MB by reserving space in the middle of .text.

A new config option CONFIG_PPC_FTRACE_OUT_OF_LINE_NUM_RESERVE is introduced to specify the number of function stubs that are reserved in .text. On boot, ftrace utilizes stubs from this area first before using the stub area at the end of .text.

A ppc64le defconfig has ~44k functions that can be traced. A more conservative value of 32k functions is chosen as the default value of PPC_FTRACE_OUT_OF_LINE_NUM_RESERVE so that we do not allot more space than necessary by default. If building a kernel that only has 32k trace-able functions, we won't allot any more space at the end of .text during the pass on vmlinux.o. Otherwise, only the remaining functions get space for stubs at the end of .text. This default value should help cover a .text size of ~48MB in total (including space reserved at the end of .text, which can cover up to 32MB), which should be sufficient for most common builds. For a very small kernel build, this can be set to 0. Or, this can be bumped up to a larger value to support vmlinux .text size up to ~64MB.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-14-hbathini@linux.ibm.com
2024-10-31  powerpc64/ftrace: Move ftrace sequence out of line  (Naveen N Rao)

The function profile sequence on powerpc includes two instructions at the beginning of each function:

	mflr	r0
	bl	ftrace_caller

The call to ftrace_caller() gets nop'ed out during kernel boot and is patched in when ftrace is enabled. Given the sequence, we cannot return from ftrace_caller with 'blr' as we need to keep LR and r0 intact. This results in link stack (return address predictor) imbalance when ftrace is enabled. To address that, we would like to use a three instruction sequence:

	mflr	r0
	bl	ftrace_caller
	mtlr	r0

Furthermore, to support DYNAMIC_FTRACE_WITH_CALL_OPS, we need to reserve two instruction slots before the function. This results in a total of five instruction slots to be reserved for ftrace use on each function that is traced.

Move the function profile sequence out-of-line to minimize its impact. To do this, we reserve a single nop at function entry using -fpatchable-function-entry=1 and add a pass on vmlinux.o to determine the total number of functions that can be traced. This is then used to generate a .S file reserving the appropriate amount of space for use as ftrace stubs, which is built and linked into vmlinux.

On bootup, the stub space is split into separate stubs per function and populated with the proper instruction sequence. A pointer to the associated stub is maintained in dyn_arch_ftrace.

For modules, space for ftrace stubs is reserved from the generic module stub space.

This is restricted to and enabled by default only on 64-bit powerpc, though there are some changes to accommodate 32-bit powerpc. This is done so that 32-bit powerpc could choose to opt into this based on further tests and benchmarks.

As an example, after this patch, kernel functions will have a single nop at function entry:

	<kernel_clone>:
		addis	r2,r12,467
		addi	r2,r2,-16028
		nop
		mfocrf	r11,8
		...

When ftrace is enabled, the nop is converted to an unconditional branch to the stub associated with that function:

	<kernel_clone>:
		addis	r2,r12,467
		addi	r2,r2,-16028
		b	ftrace_ool_stub_text_end+0x11b28
		mfocrf	r11,8
		...

The associated stub:

	<ftrace_ool_stub_text_end+0x11b28>:
		mflr	r0
		bl	ftrace_caller
		mtlr	r0
		b	kernel_clone+0xc
		...

This change showed an improvement of ~10% in the null_syscall benchmark on a Power 10 system with ftrace enabled.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-13-hbathini@linux.ibm.com
2024-10-31  powerpc/ftrace: Skip instruction patching if the instructions are the same  (Naveen N Rao)

To simplify upcoming changes to ftrace, add a check to skip actual instruction patching if the old and new instructions are the same. We still validate that the instruction is what we expect, but don't actually patch the same instruction again.
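A minimal sketch of the check, using helpers that exist in the powerpc tree; the exact placement within ftrace_modify_code() is an assumption:

	static int ftrace_modify_code(unsigned long ip, ppc_inst_t old, ppc_inst_t new)
	{
		ppc_inst_t replaced;

		/* read and validate the current instruction */
		if (copy_inst_from_kernel_nofault(&replaced, (u32 *)ip))
			return -EFAULT;
		if (!ppc_inst_equal(replaced, old))
			return -EINVAL;

		/* old == new: validated above, but nothing to patch */
		if (ppc_inst_equal(old, new))
			return 0;

		return patch_instruction((u32 *)ip, new);
	}

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-8-hbathini@linux.ibm.com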
2024-10-31  powerpc/ftrace: Remove pointer to struct module from dyn_arch_ftrace  (Naveen N Rao)

The pointer to struct module is only relevant for ftrace records belonging to kernel modules. Having this field in dyn_arch_ftrace wastes memory for all ftrace records belonging to the kernel. Remove it in favour of looking up the module from the ftrace record address, similar to other architectures.
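A hedged sketch of the address-based lookup, using the generic __module_text_address() helper (the wrapper name is illustrative):

	/* Derive the owning module from the record address instead of
	 * caching a pointer in dyn_arch_ftrace. */
	static struct module *ftrace_lookup_module(struct dyn_ftrace *rec)
	{
		struct module *mod;

		preempt_disable();
		mod = __module_text_address(rec->ip);
		preempt_enable();

		if (!mod)
			pr_err("No module found for 0x%lx\n", rec->ip);

		return mod;
	}

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-7-hbathini@linux.ibm.com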
2024-10-31  powerpc32/ftrace: Unify 32-bit and 64-bit ftrace entry code  (Naveen N Rao)

On 32-bit powerpc, gcc generates a three instruction sequence for function profiling:

	mflr	r0
	stw	r0, 4(r1)
	bl	_mcount

On kernel boot, the call to _mcount() is nop-ed out, to be patched back in when ftrace is actually enabled. The 'stw' instruction therefore is not necessary unless ftrace is enabled. Nop it out during ftrace init.

When ftrace is enabled, we want the 'stw' so that stack unwinding works properly. Perform the same within the ftrace handler, similar to 64-bit powerpc.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-5-hbathini@linux.ibm.com
2024-10-31  powerpc64/ftrace: Nop out additional 'std' instruction emitted by gcc v5.x  (Naveen N Rao)

Gcc v5.x emits a 3-instruction sequence for -mprofile-kernel:

	mflr	r0
	std	r0, 16(r1)
	bl	_mcount

Gcc v6.x moved to a simpler 2-instruction sequence by removing the 'std' instruction. The store saved the return address in the LR save area in the caller stack frame for stack unwinding. However, with dynamic ftrace, we no longer have a call to _mcount on kernel boot when ftrace is not enabled. When ftrace is enabled, that store is performed within ftrace_caller(). As such, the additional 'std' instruction is redundant. Nop it out on kernel boot.

With this change, we now use the same 2-instruction profiling sequence with both -mprofile-kernel and -fpatchable-function-entry on 64-bit powerpc.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030070850.1361304-4-hbathini@linux.ibm.com
2024-10-10  ftrace: Make ftrace_regs abstract from direct use  (Steven Rostedt)

ftrace_regs was created to hold registers that store information to save function parameters, return value and stack. Since it is a subset of pt_regs, it should only be used by its accessor functions. But because pt_regs can easily be taken from ftrace_regs (on most archs), it is tempting to use it directly. But when running on other architectures, it may fail to build or, worse, build but crash the kernel!

Instead, make struct ftrace_regs an empty structure and have the architectures define __arch_ftrace_regs; all the accessor functions will typecast to it to get to the actual fields. This will help avoid usage of ftrace_regs directly.
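A sketch of the indirection this describes; field and macro details are assumptions based on the description above and may differ from the final tree:

	struct ftrace_regs;	/* intentionally opaque to generic code */

	#define arch_ftrace_regs(fregs) ((struct __arch_ftrace_regs *)(fregs))

	static __always_inline struct pt_regs *
	arch_ftrace_get_regs(struct ftrace_regs *fregs)
	{
		/* only valid when the full pt_regs was populated */
		return &arch_ftrace_regs(fregs)->regs;
	}

Link: https://lore.kernel.org/all/20241007171027.629bdafd@gandalf.local.home/
Link: https://lore.kernel.org/20241008230628.958778821@goodmis.org
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>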
2024-02-14  powerpc/ftrace: Ignore ftrace locations in exit text sections  (Naveen N Rao)

Michael reported that we are seeing an ftrace bug on bootup when KASAN is enabled and we are using -fpatchable-function-entry:

	ftrace: allocating 47780 entries in 18 pages
	ftrace-powerpc: 0xc0000000020b3d5c: No module provided for non-kernel address
	------------[ ftrace bug ]------------
	ftrace faulted on modifying
	[<c0000000020b3d5c>] 0xc0000000020b3d5c
	Initializing ftrace call sites
	ftrace record flags: 0
	 (0)
	expected tramp: c00000000008cef4
	------------[ cut here ]------------
	WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2180 ftrace_bug+0x3c0/0x424
	Modules linked in:
	CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0-rc3-00120-g0f71dcfb4aef #860
	Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries
	NIP:  c0000000003aa81c LR: c0000000003aa818 CTR: 0000000000000000
	REGS: c0000000033cfab0 TRAP: 0700   Not tainted  (6.5.0-rc3-00120-g0f71dcfb4aef)
	MSR:  8000000002021033 <SF,VEC,ME,IR,DR,RI,LE>  CR: 28028240  XER: 00000000
	CFAR: c0000000002781a8 IRQMASK: 3
	...
	NIP [c0000000003aa81c] ftrace_bug+0x3c0/0x424
	LR [c0000000003aa818] ftrace_bug+0x3bc/0x424
	Call Trace:
	  ftrace_bug+0x3bc/0x424 (unreliable)
	  ftrace_process_locs+0x5f4/0x8a0
	  ftrace_init+0xc0/0x1d0
	  start_kernel+0x1d8/0x484

With CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y and CONFIG_KASAN=y, the compiler emits nops in functions that it generates for registering and unregistering global variables (unlike with -pg and -mprofile-kernel, where calls to _mcount() are not generated in those functions). Those functions then end up in INIT_TEXT and EXIT_TEXT respectively. We don't expect to see any profiled functions in EXIT_TEXT, so ftrace_init_nop() assumes that all addresses that aren't in the core kernel text belong to a module. Since these functions do not match that criterion, we see the above bug.

Address this by having ftrace ignore all locations in the text exit sections of vmlinux.
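A hedged sketch of the resulting check; the section bound symbols are assumptions based on the description, not confirmed names from the patch:

	extern char __exittext_begin[], __exittext_end[];

	/* reject ftrace locations that fall inside the exit text sections */
	static inline bool is_kernel_exit_text(unsigned long addr)
	{
		return addr >= (unsigned long)__exittext_begin &&
		       addr < (unsigned long)__exittext_end;
	}

Fixes: 0f71dcfb4aef ("powerpc/ftrace: Add support for -fpatchable-function-entry")
Cc: stable@vger.kernel.org # v6.6+
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240213175410.1091313-1-naveen@kernel.org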
2023-08-22  powerpc/ftrace: Add support for -fpatchable-function-entry  (Naveen N Rao)

GCC v13.1 updated support for -fpatchable-function-entry on ppc64le to emit nops after the local entry point, rather than before it. This allows us to use this in the kernel for ftrace purposes. A new script is added under arch/powerpc/tools/ to help detect if nops are emitted after the function local entry point, or before the global entry point.

With -fpatchable-function-entry, we no longer have the profiling instructions generated at function entry, so we only need to validate the presence of two nops at the ftrace location in ftrace_init_nop(). We patch the preceding instruction with 'mflr r0' to match the -mprofile-kernel ABI for subsequent ftrace use.

This changes the profiling instructions used on ppc32. The default -pg option emits an additional 'stw' instruction after 'mflr r0' and before the branch to _mcount 'bl _mcount'. This is very similar to the original -mprofile-kernel implementation on ppc64le, where an additional 'std' instruction was used to save LR to its save location in the caller's stackframe. Subsequently, this additional store was removed in later compiler versions for performance reasons. The same reasons apply for ppc32, so we only patch in a 'mflr r0'.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/68586d22981a2c3bb45f27a2b621173d10a7d092.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Implement ftrace_replace_code()  (Naveen N Rao)

Implement ftrace_replace_code() to consolidate logic from the different ftrace patching routines: ftrace_make_nop(), ftrace_make_call() and ftrace_modify_call(). Note that ftrace_make_call() is still required primarily to handle patching modules during their load time. The other two routines should no longer be called.

This lays the groundwork to enable better control in patching ftrace locations, including the ability to nop-out preceding profiling instructions when ftrace is disabled.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/c28f852225646b0561bbf3c1d22d03f041ace8e0.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Replace use of ftrace_call_replace() with ftrace_create_branch_inst()  (Naveen N Rao)

ftrace_create_branch_inst() is clearer about its intent than ftrace_call_replace().

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/953513b88fa922ba7a66d772dc1310710efe9177.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Simplify ftrace_modify_call()  (Naveen N Rao)

Now that we validate the ftrace location during initialization in ftrace_init_nop(), we can simplify ftrace_modify_call() to patch in the updated branch instruction without worrying about the instructions surrounding the ftrace location. Note that we continue to ensure we have the expected branch instruction at the ftrace location before patching it with the updated branch destination.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/06275720939f8ee4c2f61c9e9a3e89b1fa3c441d.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Simplify ftrace_make_call()  (Naveen N Rao)

Now that we validate the ftrace location during initialization in ftrace_init_nop(), we can simplify ftrace_make_call() to replace the nop without worrying about the instructions surrounding the ftrace location. Note that we continue to ensure that we have a nop at the ftrace location before patching it.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2d28866d2f556488a663981abe5621511efb207b.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Simplify ftrace_make_nop()  (Naveen N Rao)

Now that we validate the ftrace location during initialization in ftrace_init_nop(), we can simplify ftrace_make_nop() to patch in the nop without worrying about the instructions surrounding the ftrace location. Note that we continue to ensure that we have a 'bl' to ftrace_[regs_]caller at the ftrace location before nop-ing it out.
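A minimal sketch of the simplified flow (illustrative; ftrace_create_branch_inst() is the helper introduced earlier in this series, and the exact helper plumbing is an assumption):

	int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
			    unsigned long addr)
	{
		/* verify the expected 'bl ftrace_[regs_]caller', then nop it */
		ppc_inst_t old = ftrace_create_branch_inst(rec->ip, addr, 1);
		ppc_inst_t new = ppc_inst(PPC_RAW_NOP());

		return ftrace_modify_code(rec->ip, old, new);
	}

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/e12ccbf28c50c3a07fb614f4d392e55f7098a729.1687166935.git.naveen@kernel.org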
2023-08-22  powerpc/ftrace: Add separate ftrace_init_nop() with additional validation  (Naveen N Rao)

Currently, we validate instructions around the ftrace location every time we have to enable/disable ftrace. Introduce ftrace_init_nop() to instead perform all the validation during ftrace initialization. This allows us to simply patch the necessary instructions during enabling/disabling ftrace.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/f373684081e8e98be09b7f44d2d93069768324dc.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Stop re-purposing linker generated long branches for ftrace  (Naveen N Rao)

Commit 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs") added ftrace support for ppc64 kernel images with a text section larger than 32MB. The patch did two things:
1. Add stubs at the end of .text to branch into ftrace_[regs_]caller for functions that were out of branch range.
2. Re-purpose linker-generated long branches to _mcount to instead branch to ftrace_[regs_]caller.

Before that, we only supported kernel .text up to ~32MB. With the above, we now support up to ~96MB:
 - The first 32MB of kernel text can branch directly into ftrace_[regs_]caller since that symbol is usually at the beginning.
 - The modified long_branch from (2) above is used by the next 32MB of kernel text.
 - The next 32MB of kernel text can use the stub at the end of text to branch back to ftrace_[regs_]caller.

While re-purposing the long branch works in practice, it still restricts ftrace to kernel text up to ~96MB. The stub at the end of kernel text from (1) already enables us to extend ftrace support for kernel text up to 64MB, which fulfils the original requirement. Further, once we switch to -fpatchable-function-entry, there will not be a long branch that we can use.

Stop re-purposing the linker-generated long branches for ftrace to simplify the code. If there are good reasons to support ftrace on kernels beyond 64MB, we can consider adding support by using -fpatchable-function-entry.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/33fa3be97f8e1f2171254ef2e1b0d5c8836c11fd.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Refactor ftrace_modify_code()  (Naveen N Rao)

Split up ftrace_modify_code() into a few helpers for future use. Also update error messages accordingly.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/a8daa49712b44ff539e6c22a2ea649a540386798.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Extend ftrace support for large kernels to ppc32  (Naveen N Rao)

Commit 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs") added ftrace support for ppc64 kernel images with a text section larger than 32MB. The approach itself isn't specific to ppc64, so extend the same to also work on ppc32.

While at it, reduce the space reserved for the stub from 64 bytes to 32 bytes, since the different stub variants are all less than 8 instructions.

To reduce use of #ifdef, a stub implementation is provided for kernel_toc_address() and -SZ_2G is cast to 'long long' to prevent errors on ppc32.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/9fa3258cbb9105cf8a0a8135214d44ffbc75fe84.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Use FTRACE_REGS_ADDR to identify the correct ftrace trampoline  (Naveen N Rao)

Instead of keying off DYNAMIC_FTRACE_WITH_REGS, use FTRACE_REGS_ADDR to identify the proper ftrace trampoline address to use.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/6045a280a57a7ea937a5bb13ccac747026dbfb07.1687166935.git.naveen@kernel.org
2023-08-22  powerpc/ftrace: Simplify function_graph support in ftrace.c  (Naveen N Rao)

Since we now support DYNAMIC_FTRACE_WITH_ARGS across ppc32 and ppc64 ELFv2, we can simplify the function_graph tracer support code in ftrace.c.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/4dc92c4b1ed444dc62b748ae7327acdb9e096864.1687166935.git.naveen@kernel.org
2023-08-22  powerpc64/ftrace: Move ELFv1 and -pg support code into a separate file  (Naveen N Rao)

ELFv1 support is deprecated and on the way out. Pre -mprofile-kernel ftrace support (-pg only) is very limited and is retained primarily for clang builds. It won't be necessary once clang lands support for -fpatchable-function-entry.

Copy the existing ftrace code supporting these into ftrace_pg.c. ftrace.c can then be refactored and enhanced with a focus on ppc32 and ppc64 ELFv2.

Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1eb6cc6c3141ddb77a2a25f8a9e83d83ff312b02.1687166935.git.naveen@kernel.org
2023-04-20  powerpc/64: vmlinux support building with PCREL addresing  (Nicholas Piggin)

PC-Relative or PCREL addressing is an extension to the ELF ABI which uses Power ISA v3.1 PC-relative instructions to calculate addresses, rather than the traditional TOC scheme. Add an option to build vmlinux using pcrel addressing. Modules continue to use TOC addressing.

 - TOC address helpers and r2 are poisoned with -1 when running vmlinux. r2 could be used for something useful once things are ironed out.

 - Assembly must call C functions with @notoc annotation, or the linker complains about a missing nop after the call. This is done with the CFUNC macro introduced earlier.

 - Boot: with the exception of prom_init, the execution branches to the kernel virtual address early in boot, before any addresses are generated, which ensures 34-bit pcrel addressing does not miss the high PAGE_OFFSET bits. TOC relative addressing has a similar requirement. prom_init does not go to the virtual address and its addresses should not carry over to the post-prom kernel.

 - Ftrace trampolines are converted from TOC addressing to pcrel addressing, including module ftrace trampolines that currently use the kernel TOC to find ftrace target functions.

 - BPF function prologue and function calling generation are converted from TOC to pcrel.

 - copypage_64.S has an interesting problem: prefixed instructions have alignment restrictions, so the linker can add padding, which makes the assembler treat the difference between two local labels as non-constant even if alignment is arranged so padding is not required. This may need toolchain help to solve nicely; for now, move the prefix instruction out of the alternate patch section to work around it.

This reduces kernel text size by about 6%.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230408021752.862660-6-npiggin@gmail.com
2023-04-20  powerpc/64: Add support to build with prefixed instructions  (Nicholas Piggin)

Add an option to build kernel and module with prefixed instructions if the CPU and toolchain support it. This is not related to kernel support for userspace execution of prefixed instructions.

Building with prefixed instructions breaks some extended inline asm memory addressing; for example, it will provide immediates that exceed the range of simple load/store displacement. Whether this is a toolchain or a kernel asm problem remains to be seen. For now, these are replaced with simpler and less efficient direct register addressing when compiling with prefixed.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230408021752.862660-4-npiggin@gmail.com
2022-08-10  powerpc64/ftrace: Fix ftrace for clang builds  (Naveen N. Rao)

Clang doesn't support the -mprofile-kernel ABI, so guard the checks against CONFIG_DYNAMIC_FTRACE_WITH_REGS, rather than the ELF ABI version.

Fixes: 23b44fc248f4 ("powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64")
Cc: stable@vger.kernel.org # v5.19+
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Ondrej Mosnacek <omosnacek@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Ondrej Mosnacek <omosnacek@gmail.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://github.com/llvm/llvm-project/issues/57031
Link: https://github.com/ClangBuiltLinux/linux/issues/1682
Link: https://lore.kernel.org/r/20220809095907.418764-1-naveen.n.rao@linux.vnet.ibm.com
2022-07-25  powerpc: Fix all occurences of duplicate words  (Michael Ellerman)

Since commit 87c78b612f4f ("powerpc: Fix all occurences of "the the"") fixed "the the", there's now a steady stream of patches fixing other duplicate words. Just fix them all at once, to save the overhead of dealing with individual patches for each case. This leaves a few cases of "that that", which in some contexts is correct.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220718095158.326606-1-mpe@ellerman.id.au
2022-06-29  powerpc/64: Drop ppc_inst_as_str()  (Michael Ellerman)

The ppc_inst_as_str() macro tries to make printing variable length, aka "prefixed", instructions convenient. It mostly succeeds, but it does hide an on-stack buffer, which triggers stack protector.

More problematically, it doesn't compile at all with GCC 12, with -Wdangling-pointer, due to the fact that it returns the char buffer declared inside the macro:

	arch/powerpc/kernel/trace/ftrace.c: In function '__ftrace_modify_call':
	./include/linux/printk.h:475:44: error: using a dangling pointer to '__str' [-Werror=dangling-pointer=]
	  475 | #define printk(fmt, ...) printk_index_wrap(_printk, fmt, ##__VA_ARGS__)
	...
	arch/powerpc/kernel/trace/ftrace.c:567:17: note: in expansion of macro 'pr_err'
	  567 |                 pr_err("Not expected bl: opcode is %s\n", ppc_inst_as_str(op));
	      |                 ^~~~~~
	./arch/powerpc/include/asm/inst.h:156:14: note: '__str' declared here
	  156 |         char __str[PPC_INST_STR_LEN];   \
	      |              ^~~~~

This could be fixed by having the caller declare the buffer, but in some places there'd need to be two buffers. In all cases where ppc_inst_as_str() is used the output is not really meant for user consumption; it's almost always indicative of a kernel bug.

A simpler solution is to just print the value as an unsigned long. For normal instructions the output is identical. For prefixed instructions the value is printed as a single 64-bit quantity, whereas previously the low half was printed first. But that is good enough for debug output, especially as prefixed instructions will be rare in kernel code in practice.

Old:
	c000000000111170  60420000	ori     r2,r2,0
	c000000000111174  04100001 e580fb00	.long 0xe580fb0004100001

New:
	c00000000010f90c  60420000	ori     r2,r2,0
	c00000000010f910  e580fb0004100001	.long 0xe580fb0004100001
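The replacement pattern at a call site is a one-liner; a sketch assuming ppc_inst_as_ulong() as the raw-value accessor, with 'op' being the ppc_inst_t under report:

	/* before: pr_err("Not expected bl: opcode is %s\n", ppc_inst_as_str(op)); */
	pr_err("Not expected bl: opcode is %08lx\n", ppc_inst_as_ulong(op));

Reported-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220531065936.3674348-1-mpe@ellerman.id.au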
2022-05-22  powerpc/ftrace: Remove ftrace init tramp once kernel init is complete  (Naveen N. Rao)

Stop using the ftrace trampoline for the init section once kernel init is complete.

Fixes: 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220516071422.463738-1-naveen.n.rao@linux.vnet.ibm.com
2022-05-22  powerpc/ftrace: Minimise number of #ifdefs  (Christophe Leroy)

A lot of #ifdefs can be replaced by IS_ENABLED(). Do so.
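An illustrative before/after of the pattern (the function name here is hypothetical, chosen only to show the shape of the change):

	/* before: only one branch is even compiled */
	#ifdef CONFIG_MODULES
		rc = __ftrace_make_call_mod(mod, rec, addr);
	#endif

	/* after: both branches are type-checked; dead code is eliminated */
	if (IS_ENABLED(CONFIG_MODULES))
		rc = __ftrace_make_call_mod(mod, rec, addr);

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fold in changes suggested by Naveen and Christophe on list]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/18ce6708d6f8c71d87436f9c6019f04df4125128.1652074503.git.christophe.leroy@csgroup.eu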
2022-05-22  powerpc/ftrace: Simplify expected_nop_sequence()  (Christophe Leroy)

Avoid ifdefs around expected_nop_sequence(). While at it, make it a bool.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/305d22472f1f92127fba09692df6bb5d079a8cd0.1652074503.git.christophe.leroy@csgroup.eu
2022-05-22  powerpc/ftrace: Use size macro instead of opencoding  (Christophe Leroy)

0x80000000 is SZ_2G. Use it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix comparison against unsigned -SZ_2G]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/bb6626e884acffe87b58736291df57db3deaa9b9.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding.  (Christophe Leroy)

PPC_RAW_xxx() macros are self-explanatory and less error prone than open coding. Use them in ftrace.c.
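A short illustrative fragment, assuming the PPC_RAW_*() macros from asm/ppc-opcode.h:

	/* the macro names the encoding instead of a magic number */
	ppc_inst_t nop  = ppc_inst(PPC_RAW_NOP());      /* was ppc_inst(0x60000000) */
	ppc_inst_t mflr = ppc_inst(PPC_RAW_MFLR(_R0));  /* was ppc_inst(0x7c0802a6) */

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9292094c9a69cef6d29ee83f435a557b59c45065.1652074503.git.christophe.leroy@csgroup.eu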
2022-05-19  powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1  (Christophe Leroy)

To make it explicit, use BRANCH_SET_LINK instead of the value 1 when calling create_branch().
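The change at a call site looks like this (illustrative fragment; 'op', 'ip' and 'addr' stand in for the surrounding context):

	/* was: err = create_branch(&op, (u32 *)ip, addr, 1); */
	err = create_branch(&op, (u32 *)ip, addr, BRANCH_SET_LINK);

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d57847063ac93660a5af620d4df1847f10edf61a.1652074503.git.christophe.leroy@csgroup.eu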
2022-05-19  powerpc/ftrace: Remove ftrace_plt_tramps[]  (Christophe Leroy)

The ftrace_plt_tramps table is never filled, so it is useless. Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/daeeb618a6619e3a7e3f82f1bd83ca7c25af6330.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of CONFIG_DYNAMIC_FTRACE  (Christophe Leroy)

Since commit 0c0c52306f47 ("powerpc: Only support DYNAMIC_FTRACE not static"), CONFIG_DYNAMIC_FTRACE is always selected when CONFIG_FUNCTION_TRACER is selected.

To avoid confusion and having the reader wonder what happens when CONFIG_FUNCTION_TRACER is selected but CONFIG_DYNAMIC_FTRACE is not, use CONFIG_FUNCTION_TRACER in ifdefs instead of CONFIG_DYNAMIC_FTRACE.

As CONFIG_FUNCTION_GRAPH_TRACER depends on CONFIG_FUNCTION_TRACER, ftrace.o doesn't need to appear for both symbols in the Makefile.

Then, as ftrace.o is built only when CONFIG_FUNCTION_TRACER is selected, ifdef CONFIG_FUNCTION_TRACER is not needed in ftrace.c; and since it implies CONFIG_DYNAMIC_FTRACE, CONFIG_DYNAMIC_FTRACE is not needed in ftrace.c either.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/628d357503eb90b4a034f99b7df516caaff4d279.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64  (Christophe Leroy)

Since commit c93d4f6ecf4b ("powerpc/ftrace: Add module_trampoline_target() for PPC32"), __ftrace_make_nop() for PPC32 is very similar to the one for PPC64. Same for __ftrace_make_call(). Make them common.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/96f53c237316dab4b1b8c682685266faa92da816.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2}  (Christophe Leroy)

Replace all uses of PPC64_ELF_ABI_v1 and PPC64_ELF_ABI_v2 by CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2 respectively.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ba13d59e8c50bc9aa6328f1c7f0c0d0278e0a3a7.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc/ftrace: Use patch_instruction() return directly  (Christophe Leroy)

Instead of returning -EPERM when patch_instruction() fails, just return what patch_instruction() returns. That simplifies ftrace_modify_code():

	   0:	94 21 ff c0	stwu    r1,-64(r1)
	   4:	93 e1 00 3c	stw     r31,60(r1)
	   8:	7c 7f 1b 79	mr.     r31,r3
	   c:	40 80 00 30	bge     3c <ftrace_modify_code+0x3c>
	  10:	93 c1 00 38	stw     r30,56(r1)
	  14:	7c 9e 23 78	mr      r30,r4
	  18:	7c a4 2b 78	mr      r4,r5
	  1c:	80 bf 00 00	lwz     r5,0(r31)
	  20:	7c 1e 28 40	cmplw   r30,r5
	  24:	40 82 00 34	bne     58 <ftrace_modify_code+0x58>
	  28:	83 c1 00 38	lwz     r30,56(r1)
	  2c:	7f e3 fb 78	mr      r3,r31
	  30:	83 e1 00 3c	lwz     r31,60(r1)
	  34:	38 21 00 40	addi    r1,r1,64
	  38:	48 00 00 00	b       38 <ftrace_modify_code+0x38>
				38: R_PPC_REL24	patch_instruction

Before:

	   0:	94 21 ff c0	stwu    r1,-64(r1)
	   4:	93 e1 00 3c	stw     r31,60(r1)
	   8:	7c 7f 1b 79	mr.     r31,r3
	   c:	40 80 00 4c	bge     58 <ftrace_modify_code+0x58>
	  10:	93 c1 00 38	stw     r30,56(r1)
	  14:	7c 9e 23 78	mr      r30,r4
	  18:	7c a4 2b 78	mr      r4,r5
	  1c:	80 bf 00 00	lwz     r5,0(r31)
	  20:	7c 08 02 a6	mflr    r0
	  24:	90 01 00 44	stw     r0,68(r1)
	  28:	7c 1e 28 40	cmplw   r30,r5
	  2c:	40 82 00 48	bne     74 <ftrace_modify_code+0x74>
	  30:	7f e3 fb 78	mr      r3,r31
	  34:	48 00 00 01	bl      34 <ftrace_modify_code+0x34>
				34: R_PPC_REL24	patch_instruction
	  38:	80 01 00 44	lwz     r0,68(r1)
	  3c:	20 63 00 00	subfic  r3,r3,0
	  40:	83 c1 00 38	lwz     r30,56(r1)
	  44:	7c 63 19 10	subfe   r3,r3,r3
	  48:	7c 08 03 a6	mtlr    r0
	  4c:	83 e1 00 3c	lwz     r31,60(r1)
	  50:	38 21 00 40	addi    r1,r1,64
	  54:	4e 80 00 20	blr

It improves ftrace activation/deactivation duration by about 3%.

Modify patch_instruction() return on failure to -EPERM in order to match ftrace expectations. Other users of patch_instruction() do not care about the exact error value returned.
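In C, the change amounts to this (illustrative fragment):

	/* before: squash any failure to -EPERM */
	if (patch_instruction((u32 *)ip, new))
		return -EPERM;
	return 0;

	/* after: pass the return value straight through */
	return patch_instruction((u32 *)ip, new);

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/49a8597230713e2633e7d9d7b56140787c4a7e20.1652074503.git.christophe.leroy@csgroup.eu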
2022-05-19  powerpc/ftrace: Inline ftrace_modify_code()  (Christophe Leroy)

Inlining ftrace_modify_code() increases the size of the ftrace code a bit, but brings a 5% improvement in ftrace activation. Usually in C files we let gcc decide what to do, but here it really helps to nudge gcc into inlining, though we don't want to force it with an __always_inline, which would be too much for CONFIG_CC_OPTIMIZE_FOR_SIZE.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1597a06d57cfc80e6853838c4066e799bf6c7977.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc/ftrace: Use is_offset_in_branch_range()  (Christophe Leroy)

Use is_offset_in_branch_range() instead of create_branch() to check if a target is within branch range. This patch, together with the previous one, improves ftrace activation time by 7%.
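An illustrative before/after fragment ('tramp' and 'ip' stand in for the surrounding context):

	/* before: prove reachability by actually constructing the branch */
	if (create_branch(&op, (u32 *)ip, tramp, 0))
		return -EINVAL;

	/* after: a pure range check, no instruction is built */
	if (!is_offset_in_branch_range((long)(tramp - ip)))
		return -EINVAL;

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/912ae51782f5a53c44e435497c8c3fb5cc632387.1652074503.git.christophe.leroy@csgroup.eu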
2022-05-19  powerpc/ftrace: Remove redundant create_branch() calls  (Christophe Leroy)

Since commit d5937db114e4 ("powerpc/code-patching: Fix patch_branch() return on out-of-range failure"), patch_branch() fails with -ERANGE when trying to branch out of range. No need to perform the test twice. Remove the redundant create_branch() calls.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/aa45fbad0b4b7493080835d8276c0cb4ce146503.1652074503.git.christophe.leroy@csgroup.eu
2022-05-19  powerpc/ftrace: Refactor prepare_ftrace_return()  (Christophe Leroy)

When we have CONFIG_DYNAMIC_FTRACE_WITH_ARGS, prepare_ftrace_return() is called by ftrace_graph_func(); otherwise prepare_ftrace_return() is called from assembly.

Refactor prepare_ftrace_return() into a static __prepare_ftrace_return() that will be called by both prepare_ftrace_return() and ftrace_graph_func(). This will allow GCC to fold __prepare_ftrace_return() into ftrace_graph_func().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0d42deafe353980c66cf19d3132638c05ba9f4a9.1652074503.git.christophe.leroy@csgroup.eu
2022-03-08  powerpc: Move C prototypes out of asm-prototypes.h  (Christophe Leroy)

We originally added asm-prototypes.h in commit 42f5b4cacd78 ("powerpc: Introduce asm-prototypes.h"). Its purpose was for prototypes of C functions that are only called from asm, in order to fix sparse warnings about missing prototypes.

A few months later Nick added a different use case in commit 4efca4ed05cb ("kbuild: modversions for EXPORT_SYMBOL() for asm") for C prototypes for exported asm functions. This is basically the inverse of our original usage.

Since then we've added various prototypes to asm-prototypes.h for both reasons, meaning we now need to unstitch it all. Dispatch prototypes of C functions into relevant headers and keep only the prototypes for functions defined in assembly.

For the time being, leave prom_init() there because moving it into asm/prom.h or asm/setup.h conflicts with drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowrom.o. This will be fixed later by untangling asm/pci.h and asm/prom.h, or by renaming the function in shadowrom.c.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/62d46904eca74042097acf4cb12c175e3067f3d1.1646413435.git.christophe.leroy@csgroup.eu
2022-02-16  powerpc/ftrace: Add recursion protection in prepare_ftrace_return()  (Christophe Leroy)

function_graph_enter() does not provide any recursion protection. Add a protection in prepare_ftrace_return() in case function_graph_enter() calls something that gets function graph traced.
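A hedged sketch of the guard, using the generic trace-recursion helpers; the surrounding logic is condensed from the description and may not match the patch line for line:

	unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
					    unsigned long sp)
	{
		int bit;

		bit = ftrace_test_recursion_trylock(ip, parent);
		if (bit < 0)
			return parent;	/* re-entered: bail out untraced */

		if (!function_graph_enter(parent, ip, 0, (unsigned long *)sp))
			parent = (unsigned long)&return_to_handler;

		ftrace_test_recursion_unlock(bit);
		return parent;
	}

Fixes: 830213786c49 ("powerpc/ftrace: directly call of function graph tracer by ftrace caller")
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/74edf2ff0a60e66b0d9225a137100a86a0557032.1644949750.git.christophe.leroy@csgroup.eu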
2022-02-07  powerpc/ftrace: directly call of function graph tracer by ftrace caller  (Christophe Leroy)

Modify the function graph tracer to be handled directly by the standard ftrace caller. This is made possible as powerpc now supports CONFIG_DYNAMIC_FTRACE_WITH_ARGS. This change simplifies the call of the function graph tracer from ftrace.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/04d196585ff81bde06a000bd9c633a33a5b21130.1640017960.git.christophe.leroy@csgroup.eu
2022-02-07  powerpc/ftrace: Refactor ftrace_{en/dis}able_ftrace_graph_caller  (Christophe Leroy)

ftrace_enable_ftrace_graph_caller() and ftrace_disable_ftrace_graph_caller() have common code. They will have even more common code after the following patch. Refactor them into a single ftrace_modify_ftrace_graph_caller() function.
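A hedged sketch of the shared helper this refactor creates; symbol names follow the powerpc ftrace assembly, but the exact patching details are assumptions:

	static int ftrace_modify_ftrace_graph_caller(bool enable)
	{
		unsigned long ip = (unsigned long)&ftrace_graph_call;
		unsigned long addr = (unsigned long)&ftrace_graph_caller;
		unsigned long stub = (unsigned long)&ftrace_graph_stub;

		/* flip the branch between the graph caller and its stub */
		if (enable)
			return ftrace_modify_code(ip, ppc_inst(PPC_RAW_BRANCH(stub - ip)),
						  ppc_inst(PPC_RAW_BRANCH(addr - ip)));

		return ftrace_modify_code(ip, ppc_inst(PPC_RAW_BRANCH(addr - ip)),
					  ppc_inst(PPC_RAW_BRANCH(stub - ip)));
	}

	int ftrace_enable_ftrace_graph_caller(void)
	{
		return ftrace_modify_ftrace_graph_caller(true);
	}

	int ftrace_disable_ftrace_graph_caller(void)
	{
		return ftrace_modify_ftrace_graph_caller(false);
	}

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/f37785a531f1a8f201e1b3da45997a5c77e9d820.1640017960.git.christophe.leroy@csgroup.eu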