summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/vmlinux.lds.S
AgeCommit message (Collapse)Author
2024-08-19runtime constants: move list of constants to vmlinux.lds.hJann Horn
Refactor the list of constant variables into a macro. This should make it easier to add more constants in the future. Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-09arm64: add 'runtime constant' supportLinus Torvalds
This implements the runtime constant infrastructure for arm64, allowing the dcache d_hash() function to be generated using as a constant for hash table address followed by shift by a constant of the hash index. [ Fixed up to deal with the big-endian case as per Mark Rutland ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-16arm64: head: Move early kernel mapping routines into C codeArd Biesheuvel
The asm version of the kernel mapping code works fine for creating a coarse grained identity map, but for mapping the kernel down to its exact boundaries with the right attributes, it is not suitable. This is why we create a preliminary RWX kernel mapping first, and then rebuild it from scratch later on. So let's reimplement this in C, in a way that will make it unnecessary to create the kernel page tables yet another time in paging_init(). Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240214122845.2033971-63-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-16arm64: head: Run feature override detection before mapping the kernelArd Biesheuvel
To permit the feature overrides to be taken into account before the KASLR init code runs and the kernel mapping is created, move the detection code to an earlier stage in the boot. In a subsequent patch, this will be taken advantage of by merging the preliminary and permanent mappings of the kernel text and data into a single one that gets created and relocated before start_kernel() is called. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240214122845.2033971-53-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-16arm64: head: Clear BSS and the kernel page tables in one goArd Biesheuvel
We will move the CPU feature overrides into BSS in a subsequent patch, and this requires that BSS is zeroed before the feature override detection code runs. So let's map BSS read-write in the ID map, and zero it via this mapping. Since the kernel page tables are right next to it, and also zeroed via the ID map, let's drop the separate clear_page_tables() function, and just zero everything in one go. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240214122845.2033971-51-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-16arm64: head: move relocation handling to C codeArd Biesheuvel
Now that we have a mini C runtime before the kernel mapping is up, we can move the non-trivial relocation processing code out of head.S and reimplement it in C. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20240214122845.2033971-48-ardb+git@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-05-02arm64: lds: move .got section out of .textFangrui Song
Currently, the .got section is placed within the output section .text. However, when .got is non-empty, the SHF_WRITE flag is set for .text when linked by lld. GNU ld recognizes .text as a special section and ignores the SHF_WRITE flag. By renaming .text, we can also get the SHF_WRITE flag. The kernel has performed R_AARCH64_RELATIVE resolving very early, and can then assume that .got is read-only. Let's move .got to the vmlinux_rodata pseudo-segment. As Ard Biesheuvel notes: "This matters to consumers of the vmlinux ELF representation of the kernel image, such as syzkaller, which disregards writable PT_LOAD segments when resolving code symbols. The kernel itself does not care about this distinction, but given that the GOT contains data and not code, it does not require executable permissions, and therefore does not belong in .text to begin with." Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Link: https://lore.kernel.org/r/20230502074105.1541926-1-maskray@google.com Signed-off-by: Will Deacon <will@kernel.org>
2023-02-21Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit architectural register (ZT0, for the look-up table feature) that Linux needs to save/restore - Include TPIDR2 in the signal context and add the corresponding kselftests - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG (ARM CMN) at probe time - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64 - Permit EFI boot with MMU and caches on. Instead of cleaning the entire loaded kernel image to the PoC and disabling the MMU and caches before branching to the kernel bare metal entry point, leave the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of all of system RAM to populate the initial page tables - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64 kernel (the arm32 kernel only defines the values) - Harden the arm64 shadow call stack pointer handling: stash the shadow stack pointer in the task struct on interrupt, load it directly from this structure - Signal handling cleanups to remove redundant validation of size information and avoid reading the same data from userspace twice - Refactor the hwcap macros to make use of the automatically generated ID registers. It should make new hwcaps writing less error prone - Further arm64 sysreg conversion and some fixes - arm64 kselftest fixes and improvements - Pointer authentication cleanups: don't sign leaf functions, unify asm-arch manipulation - Pseudo-NMI code generation optimisations - Minor fixes for SME and TPIDR2 handling - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic shadow call stack in two passes, intercept pfn changes in set_pte_at() without the required break-before-make sequence, attempt to dump all instructions on unhandled kernel faults * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits) arm64: fix .idmap.text assertion for large kernels kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context arm64: kprobes: Drop ID map text from kprobes blacklist perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 arm64/sme: Fix __finalise_el2 SMEver check drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable arm64/signal: Only read new data when parsing the ZT context arm64/signal: Only read new data when parsing the ZA context arm64/signal: Only read new data when parsing the SVE context arm64/signal: Avoid rereading context frame sizes arm64/signal: Make interface for restore_fpsimd_context() consistent arm64/signal: Remove redundant size validation from parse_user_sigframe() arm64/signal: Don't redundantly verify FPSIMD magic arm64/cpufeature: Use helper macros to specify hwcaps arm64/cpufeature: Always use symbolic name for feature value in hwcaps arm64/sysreg: Initial unsigned annotations for ID registers arm64/sysreg: Initial annotation of signed ID registers ...
2023-02-10Merge branches 'for-next/sysreg', 'for-next/sme', 'for-next/kselftest', ↵Catalin Marinas
'for-next/misc', 'for-next/sme2', 'for-next/tpidr2', 'for-next/scs', 'for-next/compat-hwcap', 'for-next/ftrace', 'for-next/efi-boot-mmu-on', 'for-next/ptrauth' and 'for-next/pseudo-nmi', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event perf: arm_spe: Use new PMSIDR_EL1 register enums perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors arm64/sysreg: Convert SPE registers to automatic generation arm64: Drop SYS_ from SPE register defines perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines perf/marvell: Add ACPI support to TAD uncore driver perf/marvell: Add ACPI support to DDR uncore driver perf/arm-cmn: Reset DTM_PMU_CONFIG at probe drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu" drivers/perf: hisi: Simplify the parameters of hisi_pmu_init() drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capability * for-next/sysreg: : arm64 sysreg and cpufeature fixes/updates KVM: arm64: Use symbolic definition for ISR_EL1.A arm64/sysreg: Add definition of ISR_EL1 arm64/sysreg: Add definition for ICC_NMIAR1_EL1 arm64/cpufeature: Remove 4 bit assumption in ARM64_FEATURE_MASK() arm64/sysreg: Fix errors in 32 bit enumeration values arm64/cpufeature: Fix field sign for DIT hwcap detection * for-next/sme: : SME-related updates arm64/sme: Optimise SME exit on syscall entry arm64/sme: Don't use streaming mode to probe the maximum SME VL arm64/ptrace: Use system_supports_tpidr2() to check for TPIDR2 support * for-next/kselftest: (23 commits) : arm64 kselftest fixes and improvements kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA kselftest/arm64: Fix enumeration of systems without 128 bit SME kselftest/arm64: Don't require FA64 for streaming SVE tests kselftest/arm64: Limit the maximum VL we try to set via ptrace kselftest/arm64: Correct buffer size for SME ZA storage kselftest/arm64: Remove the local NUM_VL definition kselftest/arm64: Verify simultaneous SSVE and ZA context generation kselftest/arm64: Verify that SSVE signal context has SVE_SIG_FLAG_SM set kselftest/arm64: Remove spurious comment from MTE test Makefile kselftest/arm64: Support build of MTE tests with clang kselftest/arm64: Initialise current at build time in signal tests kselftest/arm64: Don't pass headers to the compiler as source kselftest/arm64: Remove redundant _start labels from FP tests kselftest/arm64: Fix .pushsection for strings in FP tests kselftest/arm64: Run BTI selftests on systems without BTI kselftest/arm64: Fix test numbering when skipping tests kselftest/arm64: Skip non-power of 2 SVE vector lengths in fp-stress kselftest/arm64: Only enumerate power of two VLs in syscall-abi ... * for-next/misc: : Miscellaneous arm64 updates arm64/mm: Intercept pfn changes in set_pte_at() Documentation: arm64: correct spelling arm64: traps: attempt to dump all instructions arm64: Apply dynamic shadow call stack patching in two passes arm64: el2_setup.h: fix spelling typo in comments arm64: Kconfig: fix spelling arm64: cpufeature: Use kstrtobool() instead of strtobool() arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path arm64: make ARCH_FORCE_MAX_ORDER selectable * for-next/sme2: (23 commits) : Support for arm64 SME 2 and 2.1 arm64/sme: Fix __finalise_el2 SMEver check kselftest/arm64: Remove redundant _start labels from zt-test kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps kselftest/arm64: Add coverage of the ZT ptrace regset kselftest/arm64: Add SME2 coverage to syscall-abi kselftest/arm64: Add test coverage for ZT register signal frames kselftest/arm64: Teach the generic signal context validation about ZT kselftest/arm64: Enumerate SME2 in the signal test utility code kselftest/arm64: Cover ZT in the FP stress test kselftest/arm64: Add a stress test program for ZT0 arm64/sme: Add hwcaps for SME 2 and 2.1 features arm64/sme: Implement ZT0 ptrace support arm64/sme: Implement signal handling for ZT arm64/sme: Implement context switching for ZT0 arm64/sme: Provide storage for ZT0 arm64/sme: Add basic enumeration for SME2 arm64/sme: Enable host kernel to access ZT0 arm64/sme: Manually encode ZT0 load and store instructions arm64/esr: Document ISS for ZT0 being disabled arm64/sme: Document SME 2 and SME 2.1 ABI ... * for-next/tpidr2: : Include TPIDR2 in the signal context kselftest/arm64: Add test case for TPIDR2 signal frame records kselftest/arm64: Add TPIDR2 to the set of known signal context records arm64/signal: Include TPIDR2 in the signal context arm64/sme: Document ABI for TPIDR2 signal information * for-next/scs: : arm64: harden shadow call stack pointer handling arm64: Stash shadow stack pointer in the task struct on interrupt arm64: Always load shadow stack pointer directly from the task struct * for-next/compat-hwcap: : arm64: Expose compat ARMv8 AArch32 features (HWCAPs) arm64: Add compat hwcap SSBS arm64: Add compat hwcap SB arm64: Add compat hwcap I8MM arm64: Add compat hwcap ASIMDBF16 arm64: Add compat hwcap ASIMDFHM arm64: Add compat hwcap ASIMDDP arm64: Add compat hwcap FPHP and ASIMDHP * for-next/ftrace: : Add arm64 support for DYNAMICE_FTRACE_WITH_CALL_OPS arm64: avoid executing padding bytes during kexec / hibernation arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS arm64: ftrace: Update stale comment arm64: patching: Add aarch64_insn_write_literal_u64() arm64: insn: Add helpers for BTI arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT ACPI: Don't build ACPICA with '-Os' Compiler attributes: GCC cold function alignment workarounds ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS * for-next/efi-boot-mmu-on: : Permit arm64 EFI boot with MMU and caches on arm64: kprobes: Drop ID map text from kprobes blacklist arm64: head: Switch endianness before populating the ID map efi: arm64: enter with MMU and caches enabled arm64: head: Clean the ID map and the HYP text to the PoC if needed arm64: head: avoid cache invalidation when entering with the MMU on arm64: head: record the MMU state at primary entry arm64: kernel: move identity map out of .text mapping arm64: head: Move all finalise_el2 calls to after __enable_mmu * for-next/ptrauth: : arm64 pointer authentication cleanup arm64: pauth: don't sign leaf functions arm64: unify asm-arch manipulation * for-next/pseudo-nmi: : Pseudo-NMI code generation optimisations arm64: irqflags: use alternative branches for pseudo-NMI logic arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap arm64: make ARM64_HAS_GIC_PRIO_MASKING depend on ARM64_HAS_GIC_CPUIF_SYSREGS arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS
2023-01-27arm64: avoid executing padding bytes during kexec / hibernationMark Rutland
Currently we rely on the HIBERNATE_TEXT section starting with the entry point to swsusp_arch_suspend_exit, and the KEXEC_TEXT section starting with the entry point to arm64_relocate_new_kernel. In both cases we copy the entire section into a dynamically-allocated page, and then later branch to the start of this page. SYM_FUNC_START() will align the function entry points to CONFIG_FUNCTION_ALIGNMENT, and when the linker later processes the assembled code it will place padding bytes before the function entry point if the location counter was not already sufficiently aligned. The linker happens to use the value zero for these padding bytes. This padding may end up being applied whenever CONFIG_FUNCTION_ALIGNMENT is greater than 4, which can be the case with CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y or CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y. When such padding is applied, attempting to kexec or resume from hibernate will result ina crash: the kernel will branch to the padding bytes as the start of the dynamically-allocated page, and as those bytes are zero they will decode as UDF #0, which reliably triggers an UNDEFINED exception. For example: | # ./kexec --reuse-cmdline -f Image | [ 46.965800] kexec_core: Starting new kernel | [ 47.143641] psci: CPU1 killed (polled 0 ms) | [ 47.233653] psci: CPU2 killed (polled 0 ms) | [ 47.323465] psci: CPU3 killed (polled 0 ms) | [ 47.324776] Bye! | [ 47.327072] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP | [ 47.328510] Modules linked in: | [ 47.329086] CPU: 0 PID: 259 Comm: kexec Not tainted 6.2.0-rc5+ #3 | [ 47.330223] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | [ 47.331497] pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | [ 47.332782] pc : 0x43a95000 | [ 47.333338] lr : machine_kexec+0x190/0x1e0 | [ 47.334169] sp : ffff80000d293b70 | [ 47.334845] x29: ffff80000d293b70 x28: ffff000002cc0000 x27: 0000000000000000 | [ 47.336292] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 | [ 47.337744] x23: ffff80000a837858 x22: 0000000048ec9000 x21: 0000000000000010 | [ 47.339192] x20: 00000000adc83000 x19: ffff000000827000 x18: 0000000000000006 | [ 47.340638] x17: ffff800075a61000 x16: ffff800008000000 x15: ffff80000d293658 | [ 47.342085] x14: 0000000000000000 x13: ffff80000d2937f7 x12: ffff80000a7ff6e0 | [ 47.343530] x11: 00000000ffffdfff x10: ffff80000a8ef8e0 x9 : ffff80000813ef00 | [ 47.344976] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8 | [ 47.346431] x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a0a3008 | [ 47.347877] x2 : ffff80000a8220f8 x1 : 0000000043a95000 x0 : ffff000000827000 | [ 47.349334] Call trace: | [ 47.349834] 0x43a95000 | [ 47.350338] kernel_kexec+0x88/0x100 | [ 47.351070] __do_sys_reboot+0x108/0x268 | [ 47.351873] __arm64_sys_reboot+0x2c/0x40 | [ 47.352689] invoke_syscall+0x78/0x108 | [ 47.353458] el0_svc_common.constprop.0+0x4c/0x100 | [ 47.354426] do_el0_svc+0x34/0x50 | [ 47.355102] el0_svc+0x34/0x108 | [ 47.355747] el0t_64_sync_handler+0xf4/0x120 | [ 47.356617] el0t_64_sync+0x194/0x198 | [ 47.357374] Code: bad PC value | [ 47.357999] ---[ end trace 0000000000000000 ]--- | [ 47.358937] Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception | [ 47.360515] Kernel Offset: disabled | [ 47.361230] CPU features: 0x002000,00050108,c8004203 | [ 47.362232] Memory Limit: none Note: Unfortunately the code dump reports "bad PC value" as it attempts to dump some instructions prior to the UDF (i.e. before the start of the page), and terminates early upon a fault, obscuring the problem. This patch fixes this issue by aligning the section starter markes to CONFIG_FUNCTION_ALIGNMENT using the ALIGN_FUNCTION() helper, which ensures that the linker never needs to place padding bytes within the section. Assertions are added to verify each section begins with the function we expect, making our implicit requirement explicit. In future it might be nice to rework the kexec and hibernation code to decouple the section start from the entry point, but that involves much more significant changes that come with a higher risk of error, so I've tried to keep this fix as simple as possible for now. Fixes: 47a15aa54427 ("arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT") Reported-by: CKI Project <cki-project@redhat.com> Link: https://lore.kernel.org/linux-arm-kernel/29992.123012504212600261@us-mta-139.us.mimecast.lan/ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-24arm64: kernel: move identity map out of .text mappingArd Biesheuvel
Reorganize the ID map slightly so that only code that is executed with the MMU off or via the 1:1 mapping remains. This allows us to move the identity map out of the .text segment, as it will no longer need executable permissions via the kernel mapping. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230111102236.1430401-3-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-13objtool/idle: Validate __cpuidle code as noinstrPeter Zijlstra
Idle code is very like entry code in that RCU isn't available. As such, add a little validation. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20230112195540.373461409@infradead.org
2022-11-09arm64: unwind: add asynchronous unwind tables to kernel and modulesArd Biesheuvel
Enable asynchronous unwind table generation for both the core kernel as well as modules, and emit the resulting .eh_frame sections as init code so we can use the unwind directives for code patching at boot or module load time. This will be used by dynamic shadow call stack support, which will rely on code patching rather than compiler codegen to emit the shadow call stack push and pop instructions. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-25Merge branch 'for-next/boot' into for-next/coreWill Deacon
* for-next/boot: (34 commits) arm64: fix KASAN_INLINE arm64: Add an override for ID_AA64SMFR0_EL1.FA64 arm64: Add the arm64.nosve command line option arm64: Add the arm64.nosme command line option arm64: Expose a __check_override primitive for oddball features arm64: Allow the idreg override to deal with variable field width arm64: Factor out checking of a feature against the override into a macro arm64: Allow sticky E2H when entering EL1 arm64: Save state of HCR_EL2.E2H before switch to EL1 arm64: Rename the VHE switch to "finalise_el2" arm64: mm: fix booting with 52-bit address space arm64: head: remove __PHYS_OFFSET arm64: lds: use PROVIDE instead of conditional definitions arm64: setup: drop early FDT pointer helpers arm64: head: avoid relocating the kernel twice for KASLR arm64: kaslr: defer initialization to initcall where permitted arm64: head: record CPU boot mode after enabling the MMU arm64: head: populate kernel page tables with MMU and caches on arm64: head: factor out TTBR1 assignment into a macro arm64: idreg-override: use early FDT mapping in ID map ...
2022-06-24arm64: head: use relative references to the RELA and RELR tablesArd Biesheuvel
Formerly, we had to access the RELA and RELR tables via the kernel mapping that was being relocated, and so deriving the start and end addresses using ADRP/ADD references was not possible, as the relocation code runs from the ID map. Now that we map the entire kernel image via the ID map, we can simplify this, and just load the entries via the ID map as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-14-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: cover entire kernel image in initial ID mapArd Biesheuvel
As a first step towards avoiding the need to create, tear down and recreate the kernel virtual mapping with MMU and caches disabled, start by expanding the ID map so it covers the page tables as well as all executable code. This will allow us to populate the page tables with the MMU and caches on, and call KASLR init code before setting up the virtual mapping. Since this ID map is only needed at boot, create it as a temporary set of page tables, and populate the permanent ID map after enabling the MMU and caches. While at it, switch to read-only attributes for the where possible, as writable permissions are only needed for the initial kernel page tables. Note that on 4k granule configurations, the permanent ID map will now be reduced to a single page rather than a 2M block mapping. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-13-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: entry: simplify trampoline data pageArd Biesheuvel
Get rid of some clunky open coded arithmetic on section addresses, by emitting the trampoline data variables into a separate, dedicated r/o data section, and putting it at the next page boundary. This way, we can access the literals via single LDR instruction. While at it, get rid of other, implicit literals, and use ADRP/ADD or MOVZ/MOVK sequences, as appropriate. Note that the latter are only supported for CONFIG_RELOCATABLE=n (which is usually the case if CONFIG_RANDOMIZE_BASE=n), so update the CPP conditionals to reflect this. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220622161010.3845775-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-05-17arm64: lds: move special code sections out of kernel exec segmentArd Biesheuvel
There are a few code sections that are emitted into the kernel's executable .text segment simply because they contain code, but are actually never executed via this mapping, so they can happily live in a region that gets mapped without executable permissions, reducing the risk of being gadgetized. Note that the kexec and hibernate region contents are always copied into a fresh page, and so there is no need to align them as long as the overall size of each is below 4 KiB. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220429131347.3621090-2-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-02-15arm64: entry: Allow the trampoline text to occupy multiple pagesJames Morse
Adding a second set of vectors to .entry.tramp.text will make it larger than a single 4K page. Allow the trampoline text to occupy up to three pages by adding two more fixmap slots. Previous changes to tramp_valias allowed it to reach beyond a single page. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
2021-10-29Merge branch 'for-next/kexec' into for-next/coreWill Deacon
* for-next/kexec: arm64: trans_pgd: remove trans_pgd_map_page() arm64: kexec: remove cpu-reset.h arm64: kexec: remove the pre-kexec PoC maintenance arm64: kexec: keep MMU enabled during kexec relocation arm64: kexec: install a copy of the linear-map arm64: kexec: use ld script for relocation function arm64: kexec: relocate in EL1 mode arm64: kexec: configure EL2 vectors for kexec arm64: kexec: pass kimage as the only argument to relocation function arm64: kexec: Use dcache ops macros instead of open-coding arm64: kexec: skip relocation code for inplace kexec arm64: kexec: flush image and lists during kexec load time arm64: hibernate: abstract ttrb0 setup function arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors arm64: kernel: add helper for booted at EL2 and not VHE
2021-10-21arm64: vmlinux.lds.S: remove `.fixup` sectionMark Rutland
We no longer place anything into a `.fixup` section, so we no longer need to place those sections into the `.text` section in the main kernel Image. Remove the use of `.fixup`. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-14-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-21arm64: extable: add `type` and `data` fieldsMark Rutland
Subsequent patches will add specialized handlers for fixups, in addition to the simple PC fixup and BPF handlers we have today. In preparation, this patch adds a new `type` field to struct exception_table_entry, and uses this to distinguish the fixup and BPF cases. A `data` field is also added so that subsequent patches can associate data specific to each exception site (e.g. register numbers). Handlers are named ex_handler_*() for consistency, following the exmaple of x86. At the same time, get_ex_fixup() is split out into a helper so that it can be used by other ex_handler_*() functions ins subsequent patches. This patch will increase the size of the exception tables, which will be remedied by subsequent patches removing redundant fixup code. There should be no functional change as a result of this patch. Since each entry is now 12 bytes in size, we must reduce the alignment of each entry from `.align 3` (i.e. 8 bytes) to `.align 2` (i.e. 4 bytes), which is the natrual alignment of the `insn` and `fixup` fields. The current 8-byte alignment is a holdover from when the `insn` and `fixup` fields was 8 bytes, and while not harmful has not been necessary since commit: 6c94f27ac847ff8e ("arm64: switch to relative exception tables") Similarly, RO_EXCEPTION_TABLE_ALIGN is dropped to 4 bytes. Concurrently with this patch, x86's exception table entry format is being updated (similarly to a 12-byte format, with 32-bytes of absolute data). Once both have been merged it should be possible to unify the sorttable logic for the two. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: James Morse <james.morse@arm.com> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-11-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-10-01arm64: kexec: use ld script for relocation functionPasha Tatashin
Currently, relocation code declares start and end variables which are used to compute its size. The better way to do this is to use ld script, and put relocation function in its own section. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210930143113.1502553-11-pasha.tatashin@soleen.com Signed-off-by: Will Deacon <will@kernel.org>
2021-08-04arm64: Move .hyp.rodata outside of the _sdata.._edata rangeMarc Zyngier
The HYP rodata section is currently lumped together with the BSS, which isn't exactly what is expected (it gets registered with kmemleak, for example). Move it away so that it is actually marked RO. As an added benefit, it isn't registered with kmemleak anymore. Fixes: 380e18ade4a5 ("KVM: arm64: Introduce a BSS section for use at Hyp") Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org #5.13 Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210802123830.2195174-2-maz@kernel.org
2021-03-19KVM: arm64: Page-align the .hyp sectionsQuentin Perret
We will soon unmap the .hyp sections from the host stage 2 in Protected nVHE mode, which obviously works with at least page granularity, so make sure to align them correctly. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-37-qperret@google.com
2021-03-19KVM: arm64: Introduce a BSS section for use at HypQuentin Perret
Currently, the hyp code cannot make full use of a bss, as the kernel section is mapped read-only. While this mapping could simply be changed to read-write, it would intermingle even more the hyp and kernel state than they currently are. Instead, introduce a __hyp_bss section, that uses reserved pages, and create the appropriate RW hyp mappings during KVM init. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-8-qperret@google.com
2021-02-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "x86: - Support for userspace to emulate Xen hypercalls - Raise the maximum number of user memslots - Scalability improvements for the new MMU. Instead of the complex "fast page fault" logic that is used in mmu.c, tdp_mmu.c uses an rwlock so that page faults are concurrent, but the code that can run against page faults is limited. Right now only page faults take the lock for reading; in the future this will be extended to some cases of page table destruction. I hope to switch the default MMU around 5.12-rc3 (some testing was delayed due to Chinese New Year). - Cleanups for MAXPHYADDR checks - Use static calls for vendor-specific callbacks - On AMD, use VMLOAD/VMSAVE to save and restore host state - Stop using deprecated jump label APIs - Workaround for AMD erratum that made nested virtualization unreliable - Support for LBR emulation in the guest - Support for communicating bus lock vmexits to userspace - Add support for SEV attestation command - Miscellaneous cleanups PPC: - Support for second data watchpoint on POWER10 - Remove some complex workarounds for buggy early versions of POWER9 - Guest entry/exit fixes ARM64: - Make the nVHE EL2 object relocatable - Cleanups for concurrent translation faults hitting the same page - Support for the standard TRNG hypervisor call - A bunch of small PMU/Debug fixes - Simplification of the early init hypercall handling Non-KVM changes (with acks): - Detection of contended rwlocks (implemented only for qrwlocks, because KVM only needs it for x86) - Allow __DISABLE_EXPORTS from assembly code - Provide a saner follow_pfn replacements for modules" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (192 commits) KVM: x86/xen: Explicitly pad struct compat_vcpu_info to 64 bytes KVM: selftests: Don't bother mapping GVA for Xen shinfo test KVM: selftests: Fix hex vs. decimal snafu in Xen test KVM: selftests: Fix size of memslots created by Xen tests KVM: selftests: Ignore recently added Xen tests' build output KVM: selftests: Add missing header file needed by xAPIC IPI tests KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c KVM: SVM: Make symbol 'svm_gp_erratum_intercept' static locking/arch: Move qrwlock.h include after qspinlock.h KVM: PPC: Book3S HV: Fix host radix SLB optimisation with hash guests KVM: PPC: Book3S HV: Ensure radix guest has no SLB entries KVM: PPC: Don't always report hash MMU capability for P9 < DD2.2 KVM: PPC: Book3S HV: Save and restore FSCR in the P9 path KVM: PPC: remove unneeded semicolon KVM: PPC: Book3S HV: Use POWER9 SLBIA IH=6 variant to clear SLB KVM: PPC: Book3S HV: No need to clear radix host SLB before loading HPT guest KVM: PPC: Book3S HV: Fix radix guest SLB side channel KVM: PPC: Book3S HV: Remove support for running HPT guest on RPT host without mixed mode support KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWR KVM: PPC: Book3S HV: Add infrastructure to support 2nd DAWR ...
2021-02-03arm64: vmlinux.ld.S: add assertion for tramp_pg_dir offsetJoey Gouly
Add TRAMP_SWAPPER_OFFSET and use that instead of hardcoding the offset between swapper_pg_dir and tramp_pg_dir. Then use TRAMP_SWAPPER_OFFSET to assert that the offset is correct at link time. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210202123658.22308-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-03arm64: vmlinux.ld.S: add assertion for reserved_pg_dir offsetJoey Gouly
Add RESERVED_SWAPPER_OFFSET and use that instead of hardcoding the offset between swapper_pg_dir and reserved_pg_dir. Then use RESERVED_SWAPPER_OFFSET to assert that the offset is correct at link time. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210202123658.22308-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-01-23KVM: arm64: Generate hyp relocation dataDavid Brazdil
Add a post-processing step to compilation of KVM nVHE hyp code which calls a custom host tool (gen-hyprel) on the partially linked object file (hyp sections' names prefixed). The tool lists all R_AARCH64_ABS64 data relocations targeting hyp sections and generates an assembly file that will form a new section .hyp.reloc in the kernel binary. The new section contains an array of 32-bit offsets to the positions targeted by these relocations. Since these addresses of those positions will not be determined until linking of `vmlinux`, each 32-bit entry carries a R_AARCH64_PREL32 relocation with addend <section_base_sym> + <r_offset>. The linker of `vmlinux` will therefore fill the slot accordingly. This relocation data will be used at runtime to convert the kernel VAs at those positions to hyp VAs. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210105180541.65031-5-dbrazdil@google.com
2021-01-23KVM: arm64: Set up .hyp.rodata ELF sectionDavid Brazdil
We will need to recognize pointers in .rodata specific to hyp, so establish a .hyp.rodata ELF section. Merge it with the existing .hyp.data..ro_after_init as they are treated the same at runtime. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210105180541.65031-3-dbrazdil@google.com
2020-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...
2020-12-09Merge branch 'for-next/misc' into for-next/coreCatalin Marinas
* for-next/misc: : Miscellaneous patches arm64: vmlinux.lds.S: Drop redundant *.init.rodata.* kasan: arm64: set TCR_EL1.TBID1 when enabled arm64: mte: optimize asynchronous tag check fault flag check arm64/mm: add fallback option to allocate virtually contiguous memory arm64/smp: Drop the macro S(x,s) arm64: consistently use reserved_pg_dir arm64: kprobes: Remove redundant kprobe_step_ctx # Conflicts: # arch/arm64/kernel/vmlinux.lds.S
2020-12-09Merge branches 'for-next/kvm-build-fix', 'for-next/va-refactor', ↵Catalin Marinas
'for-next/lto', 'for-next/mem-hotplug', 'for-next/cppc-ffh', 'for-next/pad-image-header', 'for-next/zone-dma-default-32-bit', 'for-next/signal-tag-bits' and 'for-next/cmdline-extended' into for-next/core * for-next/kvm-build-fix: : Fix KVM build issues with 64K pages KVM: arm64: Fix build error in user_mem_abort() * for-next/va-refactor: : VA layout changes arm64: mm: don't assume struct page is always 64 bytes Documentation/arm64: fix RST layout of memory.rst arm64: mm: tidy up top of kernel VA space arm64: mm: make vmemmap region a projection of the linear region arm64: mm: extend linear region for 52-bit VA configurations * for-next/lto: : Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y arm64: alternatives: Remove READ_ONCE() usage during patch operation arm64: cpufeatures: Add capability for LDAPR instruction arm64: alternatives: Split up alternative.h arm64: uaccess: move uao_* alternatives to asm-uaccess.h * for-next/mem-hotplug: : Memory hotplug improvements arm64/mm/hotplug: Ensure early memory sections are all online arm64/mm/hotplug: Enable MEM_OFFLINE event handling arm64/mm/hotplug: Register boot memory hot remove notifier earlier arm64: mm: account for hotplug memory when randomizing the linear region * for-next/cppc-ffh: : Add CPPC FFH support using arm64 AMU counters arm64: abort counter_read_on_cpu() when irqs_disabled() arm64: implement CPPC FFH support using AMUs arm64: split counter validation function arm64: wrap and generalise counter read functions * for-next/pad-image-header: : Pad Image header to 64KB and unmap it arm64: head: tidy up the Image header definition arm64/head: avoid symbol names pointing into first 64 KB of kernel image arm64: omit [_text, _stext) from permanent kernel mapping * for-next/zone-dma-default-32-bit: : Default to 32-bit wide ZONE_DMA (previously reduced to 1GB for RPi4) of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS mm: Remove examples from enum zone_type comment arm64: mm: Set ZONE_DMA size based on early IORT scan arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges of: unittest: Add test for of_dma_get_max_cpu_address() of/address: Introduce of_dma_get_max_cpu_address() arm64: mm: Move zone_dma_bits initialization into zone_sizes_init() arm64: mm: Move reserve_crashkernel() into mem_init() arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required arm64: Ignore any DMA offsets in the max_zone_phys() calculation * for-next/signal-tag-bits: : Expose the FAR_EL1 tag bits in siginfo arm64: expose FAR_EL1 tag bits in siginfo signal: define the SA_EXPOSE_TAGBITS bit in sa_flags signal: define the SA_UNSUPPORTED bit in sa_flags arch: provide better documentation for the arch-specific SA_* flags signal: clear non-uapi flag bits when passing/returning sa_flags arch: move SA_* definitions to generic headers parisc: start using signal-defs.h parisc: Drop parisc special case for __sighandler_t * for-next/cmdline-extended: : Add support for CONFIG_CMDLINE_EXTENDED arm64: Extend the kernel command line from the bootloader arm64: kaslr: Refactor early init command line parsing
2020-12-04KVM: arm64: Add .hyp.data..ro_after_init ELF sectionDavid Brazdil
Add rules for renaming the .data..ro_after_init ELF section in KVM nVHE object files to .hyp.data..ro_after_init, linking it into the kernel and mapping it in hyp at runtime. The section is RW to the host, then mapped RO in hyp. The expectation is that the host populates the variables in the section and they are never changed by hyp afterwards. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201202184122.26046-13-dbrazdil@google.com
2020-11-27arm64: vmlinux.lds.S: Drop redundant *.init.rodata.*Youling Tang
We currently try to emit *.init.rodata.* twice, once in INIT_DATA, and once in the line immediately following it. As the two section definitions are identical, the latter is redundant and can be dropped. This patch drops the redundant *.init.rodata.* section definition. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1605750340-910-1-git-send-email-tangyouling@loongson.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-17arm64: omit [_text, _stext) from permanent kernel mappingArd Biesheuvel
In a previous patch, we increased the size of the EFI PE/COFF header to 64 KB, which resulted in the _stext symbol to appear at a fixed offset of 64 KB into the image. Since 64 KB is also the largest page size we support, this completely removes the need to map the first 64 KB of the kernel image, given that it only contains the arm64 Image header and the EFI header, neither of which we ever access again after booting the kernel. More importantly, we should avoid an executable mapping of non-executable and not entirely predictable data, to deal with the unlikely event that we inadvertently emitted something that looks like an opcode that could be used as a gadget for speculative execution. So let's limit the kernel mapping of .text to the [_stext, _etext) region, which matches the view of generic code (such as kallsyms) when it reasons about the boundaries of the kernel's .text section. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201117124729.12642-2-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-10arm64: consistently use reserved_pg_dirMark Rutland
Depending on configuration options and specific code paths, we either use the empty_zero_page or the configuration-dependent reserved_ttbr0 as a reserved value for TTBR{0,1}_EL1. To simplify this code, let's always allocate and use the same reserved_pg_dir, replacing reserved_ttbr0. Note that this is allocated (and hence pre-zeroed), and is also marked as read-only in the kernel Image mapping. Keeping this separate from the empty_zero_page potentially helps with robustness as the empty_zero_page is used in a number of cases where a failure to map it read-only could allow it to become corrupted. The (presently unused) swapper_pg_end symbol is also removed, and comments are added wherever we rely on the offsets between the pre-allocated pg_dirs to keep these cases easily identifiable. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201103102229.8542-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-11-09arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=yWill Deacon
When building with LTO, there is an increased risk of the compiler converting an address dependency headed by a READ_ONCE() invocation into a control dependency and consequently allowing for harmful reordering by the CPU. Ensure that such transformations are harmless by overriding the generic READ_ONCE() definition with one that provides acquire semantics when building with LTO. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-10-28arm64: vmlinux.lds: account for spurious empty .igot.plt sectionsArd Biesheuvel
Now that we started making the linker warn about orphan sections (input sections that are not explicitly consumed by an output section), some configurations produce the following warning: aarch64-linux-gnu-ld: warning: orphan section `.igot.plt' from `arch/arm64/kernel/head.o' being placed in section `.igot.plt' It could be any file that triggers this - head.o is simply the first input file in the link - and the resulting .igot.plt section never actually appears in vmlinux as it turns out to be empty. So let's add .igot.plt to our collection of input sections to disregard unless they are empty. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20201028133332.5571-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-10-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "For x86, there is a new alternative and (in the future) more scalable implementation of extended page tables that does not need a reverse map from guest physical addresses to host physical addresses. For now it is disabled by default because it is still lacking a few of the existing MMU's bells and whistles. However it is a very solid piece of work and it is already available for people to hammer on it. Other updates: ARM: - New page table code for both hypervisor and guest stage-2 - Introduction of a new EL2-private host context - Allow EL2 to have its own private per-CPU variables - Support of PMU event filtering - Complete rework of the Spectre mitigation PPC: - Fix for running nested guests with in-kernel IRQ chip - Fix race condition causing occasional host hard lockup - Minor cleanups and bugfixes x86: - allow trapping unknown MSRs to userspace - allow userspace to force #GP on specific MSRs - INVPCID support on AMD - nested AMD cleanup, on demand allocation of nested SVM state - hide PV MSRs and hypercalls for features not enabled in CPUID - new test for MSR_IA32_TSC writes from host and guest - cleanups: MMU, CPUID, shared MSRs - LAPIC latency optimizations ad bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits) kvm: x86/mmu: NX largepage recovery for TDP MMU kvm: x86/mmu: Don't clear write flooding count for direct roots kvm: x86/mmu: Support MMIO in the TDP MMU kvm: x86/mmu: Support write protection for nesting in tdp MMU kvm: x86/mmu: Support disabling dirty logging for the tdp MMU kvm: x86/mmu: Support dirty logging for the TDP MMU kvm: x86/mmu: Support changed pte notifier in tdp MMU kvm: x86/mmu: Add access tracking for tdp_mmu kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU kvm: x86/mmu: Add TDP MMU PF handler kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg kvm: x86/mmu: Support zapping SPTEs in the TDP MMU KVM: Cache as_id in kvm_memory_slot kvm: x86/mmu: Add functions to handle changed TDP SPTEs kvm: x86/mmu: Allocate and free TDP MMU roots kvm: x86/mmu: Init / Uninit the TDP MMU kvm: x86/mmu: Introduce tdp_iter KVM: mmu: extract spte.h and spte.c KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp ...
2020-10-12Merge tag 'core-build-2020-10-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull orphan section checking from Ingo Molnar: "Orphan link sections were a long-standing source of obscure bugs, because the heuristics that various linkers & compilers use to handle them (include these bits into the output image vs discarding them silently) are both highly idiosyncratic and also version dependent. Instead of this historically problematic mess, this tree by Kees Cook (et al) adds build time asserts and build time warnings if there's any orphan section in the kernel or if a section is not sized as expected. And because we relied on so many silent assumptions in this area, fix a metric ton of dependencies and some outright bugs related to this, before we can finally enable the checks on the x86, ARM and ARM64 platforms" * tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/boot/compressed: Warn on orphan section placement x86/build: Warn on orphan section placement arm/boot: Warn on orphan section placement arm/build: Warn on orphan section placement arm64/build: Warn on orphan section placement x86/boot/compressed: Add missing debugging sections to output x86/boot/compressed: Remove, discard, or assert for unwanted sections x86/boot/compressed: Reorganize zero-size section asserts x86/build: Add asserts for unwanted sections x86/build: Enforce an empty .got.plt section x86/asm: Avoid generating unused kprobe sections arm/boot: Handle all sections explicitly arm/build: Assert for unwanted sections arm/build: Add missing sections arm/build: Explicitly keep .ARM.attributes sections arm/build: Refactor linker script headers arm64/build: Assert for unwanted sections arm64/build: Add missing DWARF sections arm64/build: Use common DISCARDS in linker script arm64/build: Remove .eh_frame* sections due to unwind tables ...
2020-09-30kvm: arm64: Set up hyp percpu data for nVHEDavid Brazdil
Add hyp percpu section to linker script and rename the corresponding ELF sections of hyp/nvhe object files. This moves all nVHE-specific percpu variables to the new hyp percpu section. Allocate sufficient amount of memory for all percpu hyp regions at global KVM init time and create corresponding hyp mappings. The base addresses of hyp percpu regions are kept in a dynamically allocated array in the kernel. Add NULL checks in PMU event-reset code as it may run before KVM memory is initialized. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-10-dbrazdil@google.com
2020-09-30kvm: arm64: Only define __kvm_ex_table for CONFIG_KVMDavid Brazdil
Minor cleanup that only creates __kvm_ex_table ELF section and related symbols if CONFIG_KVM is enabled. Also useful as more hyp-specific sections will be added. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-4-dbrazdil@google.com
2020-09-30kvm: arm64: Move nVHE hyp namespace macros to hyp_image.hDavid Brazdil
Minor cleanup to move all macros related to prefixing nVHE hyp section and symbol names into one place: hyp_image.h. Signed-off-by: David Brazdil <dbrazdil@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200922204910.7265-3-dbrazdil@google.com
2020-09-07arm64: get rid of TEXT_OFFSETArd Biesheuvel
TEXT_OFFSET serves no purpose, and for this reason, it was redefined as 0x0 in the v5.8 timeframe. Since this does not appear to have caused any issues that require us to revisit that decision, let's get rid of the macro entirely, along with any references to it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200825135440.11288-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-09-01arm64/build: Assert for unwanted sectionsKees Cook
In preparation for warning on orphan sections, discard unwanted non-zero-sized generated sections, and enforce other expected-to-be-zero-sized sections (since discarding them might hide problems with them suddenly gaining unexpected entries). Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-14-keescook@chromium.org
2020-09-01arm64/build: Add missing DWARF sectionsKees Cook
Explicitly include DWARF sections when they're present in the build. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-13-keescook@chromium.org
2020-09-01arm64/build: Use common DISCARDS in linker scriptKees Cook
Use the common DISCARDS rule for the linker script in an effort to regularize the linker script to prepare for warning on orphaned sections. Additionally clean up left-over no-op macros. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-12-keescook@chromium.org
2020-09-01arm64/build: Remove .eh_frame* sections due to unwind tablesKees Cook
Avoid .eh_frame* section generation by making sure both CFLAGS and AFLAGS contain -fno-asychronous-unwind-tables and -fno-unwind-tables. With all sources of .eh_frame now removed from the build, drop this DISCARD so we can be alerted in the future if it returns unexpectedly once orphan section warnings have been enabled. Suggested-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20200821194310.3089815-11-keescook@chromium.org