From 368ccecd4e4a0ed8180ad76764fd6e6d516e151c Mon Sep 17 00:00:00 2001 From: Sebastian Reichel Date: Thu, 15 Dec 2022 15:19:14 +0100 Subject: ARM: 9281/1: improve Cortex A8/A9 errata help text Document that !ARCH_MULTIPLATFORM is necessary because accessing the the errata workaround registers may not work in non-secure mode and mention that these erratas should be applied by the bootloader instead. Reviewed-by: AngeloGioacchino Del Regno Reviewed-by: Arnd Bergmann Signed-off-by: Sebastian Reichel Signed-off-by: Russell King (Oracle) --- arch/arm/Kconfig | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) (limited to 'arch/arm') diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 43c7773b89ae..45af77e67973 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -661,7 +661,9 @@ config ARM_ERRATA_458693 hazard might then cause a processor deadlock. The workaround enables the L1 caching of the NEON accesses and disables the PLD instruction in the ACTLR register. Note that setting specific bits in the ACTLR - register may not be available in non-secure mode. + register may not be available in non-secure mode and thus is not + available on a multiplatform kernel. This should be applied by the + bootloader instead. config ARM_ERRATA_460075 bool "ARM errata: Data written to the L2 cache can be overwritten with stale data" @@ -674,7 +676,9 @@ config ARM_ERRATA_460075 and overwritten with stale memory contents from external memory. The workaround disables the write-allocate mode for the L2 cache via the ACTLR register. Note that setting specific bits in the ACTLR register - may not be available in non-secure mode. + may not be available in non-secure mode and thus is not available on + a multiplatform kernel. This should be applied by the bootloader + instead. config ARM_ERRATA_742230 bool "ARM errata: DMB operation may be faulty" @@ -687,7 +691,10 @@ config ARM_ERRATA_742230 ordering of the two writes. This workaround sets a specific bit in the diagnostic register of the Cortex-A9 which causes the DMB instruction to behave as a DSB, ensuring the correct behaviour of - the two writes. + the two writes. Note that setting specific bits in the diagnostics + register may not be available in non-secure mode and thus is not + available on a multiplatform kernel. This should be applied by the + bootloader instead. config ARM_ERRATA_742231 bool "ARM errata: Incorrect hazard handling in the SCU may lead to data corruption" @@ -702,7 +709,10 @@ config ARM_ERRATA_742231 replaced from one of the CPUs at the same time as another CPU is accessing it. This workaround sets specific bits in the diagnostic register of the Cortex-A9 which reduces the linefill issuing - capabilities of the processor. + capabilities of the processor. Note that setting specific bits in the + diagnostics register may not be available in non-secure mode and thus + is not available on a multiplatform kernel. This should be applied by + the bootloader instead. config ARM_ERRATA_643719 bool "ARM errata: LoUIS bit field in CLIDR register is incorrect" @@ -739,7 +749,9 @@ config ARM_ERRATA_743622 register of the Cortex-A9 which disables the Store Buffer optimisation, preventing the defect from occurring. This has no visible impact on the overall performance or power consumption of the - processor. + processor. Note that setting specific bits in the diagnostics register + may not be available in non-secure mode and thus is not available on a + multiplatform kernel. This should be applied by the bootloader instead. config ARM_ERRATA_751472 bool "ARM errata: Interrupted ICIALLUIS may prevent completion of broadcasted operation" @@ -751,6 +763,10 @@ config ARM_ERRATA_751472 completion of a following broadcasted operation if the second operation is received by a CPU before the ICIALLUIS has completed, potentially leading to corrupted entries in the cache or TLB. + Note that setting specific bits in the diagnostics register may + not be available in non-secure mode and thus is not available on + a multiplatform kernel. This should be applied by the bootloader + instead. config ARM_ERRATA_754322 bool "ARM errata: possible faulty MMU translations following an ASID switch" -- cgit v1.2.3 From 62b95a7b44d1a30b3a967f5107ce2b4341531426 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Thu, 22 Dec 2022 18:49:51 +0100 Subject: ARM: 9282/1: vfp: Manipulate task VFP state with softirqs disabled In a subsequent patch, we will relax the kernel mode NEON policy, and permit kernel mode NEON to be used not only from task context, as is permitted today, but also from softirq context. Given that softirqs may trigger over the back of any IRQ unless they are explicitly disabled, we need to address the resulting races in the VFP state handling, by disabling softirq processing in two distinct but related cases: - kernel mode NEON will leave the FPU disabled after it completes, so any kernel code sequence that enables the FPU and subsequently accesses its registers needs to disable softirqs until it completes; - kernel_neon_begin() will preserve the userland VFP state in memory, and if it interrupts the ordinary VFP state preserve sequence, the latter will resume execution with the VFP registers corrupted, and happily continue saving them to memory. Given that disabling softirqs also disables preemption, we can replace the existing preempt_disable/enable occurrences in the VFP state handling asm code with new macros that dis/enable softirqs instead. In the VFP state handling C code, add local_bh_disable/enable() calls in those places where the VFP state is preserved. One thing to keep in mind is that, once we allow NEON use in softirq context, the result of any such interruption is that the FPEXC_EN bit in the FPEXC register will be cleared, and vfp_current_hw_state[cpu] will be NULL. This means that any sequence that [conditionally] clears FPEXC_EN and/or sets vfp_current_hw_state[cpu] to NULL does not need to run with softirqs disabled, as the result will be the same. Furthermore, the handling of THREAD_NOTIFY_SWITCH is guaranteed to run with IRQs disabled, and so it does not need protection from softirq interruptions either. Tested-by: Martin Willi Reviewed-by: Linus Walleij Signed-off-by: Ard Biesheuvel Signed-off-by: Russell King (Oracle) --- arch/arm/include/asm/assembler.h | 19 ++++++++++++------- arch/arm/kernel/asm-offsets.c | 1 + arch/arm/vfp/entry.S | 4 ++-- arch/arm/vfp/vfphw.S | 4 ++-- arch/arm/vfp/vfpmodule.c | 8 +++++++- 5 files changed, 24 insertions(+), 12 deletions(-) (limited to 'arch/arm') diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h index 28e18f79c300..06b48ce23e1c 100644 --- a/arch/arm/include/asm/assembler.h +++ b/arch/arm/include/asm/assembler.h @@ -236,21 +236,26 @@ THUMB( fpreg .req r7 ) sub \tmp, \tmp, #1 @ decrement it str \tmp, [\ti, #TI_PREEMPT] .endm - - .macro dec_preempt_count_ti, ti, tmp - get_thread_info \ti - dec_preempt_count \ti, \tmp - .endm #else .macro inc_preempt_count, ti, tmp .endm .macro dec_preempt_count, ti, tmp .endm +#endif + + .macro local_bh_disable, ti, tmp + ldr \tmp, [\ti, #TI_PREEMPT] + add \tmp, \tmp, #SOFTIRQ_DISABLE_OFFSET + str \tmp, [\ti, #TI_PREEMPT] + .endm - .macro dec_preempt_count_ti, ti, tmp + .macro local_bh_enable_ti, ti, tmp + get_thread_info \ti + ldr \tmp, [\ti, #TI_PREEMPT] + sub \tmp, \tmp, #SOFTIRQ_DISABLE_OFFSET + str \tmp, [\ti, #TI_PREEMPT] .endm -#endif #define USERL(l, x...) \ 9999: x; \ diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c index 2c8d76fd7c66..38121c59cbc2 100644 --- a/arch/arm/kernel/asm-offsets.c +++ b/arch/arm/kernel/asm-offsets.c @@ -56,6 +56,7 @@ int main(void) DEFINE(VFP_CPU, offsetof(union vfp_state, hard.cpu)); #endif #endif + DEFINE(SOFTIRQ_DISABLE_OFFSET,SOFTIRQ_DISABLE_OFFSET); #ifdef CONFIG_ARM_THUMBEE DEFINE(TI_THUMBEE_STATE, offsetof(struct thread_info, thumbee_state)); #endif diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S index 27b0a1f27fbd..9a89264cdcc0 100644 --- a/arch/arm/vfp/entry.S +++ b/arch/arm/vfp/entry.S @@ -22,7 +22,7 @@ @ IRQs enabled. @ ENTRY(do_vfp) - inc_preempt_count r10, r4 + local_bh_disable r10, r4 ldr r4, .LCvfp ldr r11, [r10, #TI_CPU] @ CPU number add r10, r10, #TI_VFPSTATE @ r10 = workspace @@ -30,7 +30,7 @@ ENTRY(do_vfp) ENDPROC(do_vfp) ENTRY(vfp_null_entry) - dec_preempt_count_ti r10, r4 + local_bh_enable_ti r10, r4 ret lr ENDPROC(vfp_null_entry) diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S index 6f7926c9c179..26c4f61ecfa3 100644 --- a/arch/arm/vfp/vfphw.S +++ b/arch/arm/vfp/vfphw.S @@ -175,7 +175,7 @@ vfp_hw_state_valid: @ else it's one 32-bit instruction, so @ always subtract 4 from the following @ instruction address. - dec_preempt_count_ti r10, r4 + local_bh_enable_ti r10, r4 ret r9 @ we think we have handled things @@ -200,7 +200,7 @@ skip: @ not recognised by VFP DBGSTR "not VFP" - dec_preempt_count_ti r10, r4 + local_bh_enable_ti r10, r4 ret lr process_exception: diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 281110423871..86fb0be41ae1 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -416,7 +416,7 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs) if (exceptions) vfp_raise_exceptions(exceptions, trigger, orig_fpscr, regs); exit: - preempt_enable(); + local_bh_enable(); } static void vfp_enable(void *unused) @@ -517,6 +517,8 @@ void vfp_sync_hwstate(struct thread_info *thread) { unsigned int cpu = get_cpu(); + local_bh_disable(); + if (vfp_state_in_hw(cpu, thread)) { u32 fpexc = fmrx(FPEXC); @@ -528,6 +530,7 @@ void vfp_sync_hwstate(struct thread_info *thread) fmxr(FPEXC, fpexc); } + local_bh_enable(); put_cpu(); } @@ -717,6 +720,8 @@ void kernel_neon_begin(void) unsigned int cpu; u32 fpexc; + local_bh_disable(); + /* * Kernel mode NEON is only allowed outside of interrupt context * with preemption disabled. This will make sure that the kernel @@ -739,6 +744,7 @@ void kernel_neon_begin(void) vfp_save_state(vfp_current_hw_state[cpu], fpexc); #endif vfp_current_hw_state[cpu] = NULL; + local_bh_enable(); } EXPORT_SYMBOL(kernel_neon_begin); -- cgit v1.2.3 From c79f81631142ee2dc4c743732427f23d18cd2dec Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Thu, 22 Dec 2022 18:52:23 +0100 Subject: ARM: 9283/1: permit non-nested kernel mode NEON in softirq context We currently only permit kernel mode NEON in process context, to avoid the need to preserve/restore the NEON register file when taking an exception while running in the kernel. Like we did on arm64, we can relax this restriction substantially, by permitting kernel mode NEON from softirq context, while ensuring that softirq processing is disabled when the NEON is being used in task context. This guarantees that only NEON context belonging to user space needs to be preserved and restored, which is already taken care of. This is especially relevant for network encryption, where incoming frames are typically handled in softirq context, and deferring software decryption to a kernel thread or falling back to C code are both undesirable from a performance PoV. Tested-by: Martin Willi Reviewed-by: Linus Walleij Signed-off-by: Ard Biesheuvel Signed-off-by: Russell King (Oracle) --- arch/arm/include/asm/simd.h | 8 ++++++++ arch/arm/vfp/vfpmodule.c | 13 ++++++------- 2 files changed, 14 insertions(+), 7 deletions(-) create mode 100644 arch/arm/include/asm/simd.h (limited to 'arch/arm') diff --git a/arch/arm/include/asm/simd.h b/arch/arm/include/asm/simd.h new file mode 100644 index 000000000000..82191dbd7e78 --- /dev/null +++ b/arch/arm/include/asm/simd.h @@ -0,0 +1,8 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#include + +static __must_check inline bool may_use_simd(void) +{ + return IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && !in_hardirq(); +} diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c index 86fb0be41ae1..01bc48d73847 100644 --- a/arch/arm/vfp/vfpmodule.c +++ b/arch/arm/vfp/vfpmodule.c @@ -723,12 +723,12 @@ void kernel_neon_begin(void) local_bh_disable(); /* - * Kernel mode NEON is only allowed outside of interrupt context - * with preemption disabled. This will make sure that the kernel - * mode NEON register contents never need to be preserved. + * Kernel mode NEON is only allowed outside of hardirq context with + * preemption and softirq processing disabled. This will make sure that + * the kernel mode NEON register contents never need to be preserved. */ - BUG_ON(in_interrupt()); - cpu = get_cpu(); + BUG_ON(in_hardirq()); + cpu = __smp_processor_id(); fpexc = fmrx(FPEXC) | FPEXC_EN; fmxr(FPEXC, fpexc); @@ -744,7 +744,6 @@ void kernel_neon_begin(void) vfp_save_state(vfp_current_hw_state[cpu], fpexc); #endif vfp_current_hw_state[cpu] = NULL; - local_bh_enable(); } EXPORT_SYMBOL(kernel_neon_begin); @@ -752,7 +751,7 @@ void kernel_neon_end(void) { /* Disable the NEON/VFP unit. */ fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN); - put_cpu(); + local_bh_enable(); } EXPORT_SYMBOL(kernel_neon_end); -- cgit v1.2.3 From cdc3116f191ad4250f992d750b2d6f72fb98afde Mon Sep 17 00:00:00 2001 From: Masahiro Yamada Date: Tue, 10 Jan 2023 09:01:29 +0100 Subject: ARM: 9285/1: remove meaningless arch/arm/mach-rda/Makefile You do not need to put Makefile if there is nothing to compile. Signed-off-by: Masahiro Yamada Signed-off-by: Russell King (Oracle) --- arch/arm/Makefile | 1 - arch/arm/mach-rda/Makefile | 2 -- 2 files changed, 3 deletions(-) delete mode 100644 arch/arm/mach-rda/Makefile (limited to 'arch/arm') diff --git a/arch/arm/Makefile b/arch/arm/Makefile index 4067f5169144..233496bfbe19 100644 --- a/arch/arm/Makefile +++ b/arch/arm/Makefile @@ -213,7 +213,6 @@ machine-$(CONFIG_ARCH_OMAP2PLUS) += omap2 machine-$(CONFIG_ARCH_ORION5X) += orion5x machine-$(CONFIG_ARCH_PXA) += pxa machine-$(CONFIG_ARCH_QCOM) += qcom -machine-$(CONFIG_ARCH_RDA) += rda machine-$(CONFIG_ARCH_REALTEK) += realtek machine-$(CONFIG_ARCH_ROCKCHIP) += rockchip machine-$(CONFIG_ARCH_RPC) += rpc diff --git a/arch/arm/mach-rda/Makefile b/arch/arm/mach-rda/Makefile deleted file mode 100644 index f126d00ecd53..000000000000 --- a/arch/arm/mach-rda/Makefile +++ /dev/null @@ -1,2 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0-only -obj- += dummy.o -- cgit v1.2.3 From b575b5a1e625b589ba1b1eb36c05fcca588cbc85 Mon Sep 17 00:00:00 2001 From: Ard Biesheuvel Date: Mon, 16 Jan 2023 12:01:48 +0100 Subject: ARM: 9286/1: crypto: Implement fused AES-CTR/GHASH version of GCM On 32-bit ARM, AES in GCM mode takes full advantage of the ARMv8 Crypto Extensions when available, resulting in a performance of 6-7 cycles per byte for typical IPsec frames on cores such as Cortex-A53, using the generic GCM template encapsulating the accelerated AES-CTR and GHASH implementations. At such high rates, any time spent copying data or doing other poorly optimized work in the generic layer hurts disproportionately, and we can get a significant performance improvement by combining the optimized AES-CTR and GHASH implementations into a single GCM driver. On Cortex-A53, this results in a performance improvement of around 75%, and AES-256-GCM-128 with RFC4106 encapsulation runs in 4 cycles per byte. Note that this code takes advantage of the fact that kernel mode NEON is now supported in softirq context as well, and therefore does not provide a non-NEON fallback path at all. (AEADs are only callable in process or softirq context) Acked-by: Herbert Xu Signed-off-by: Ard Biesheuvel Signed-off-by: Russell King (Oracle) --- arch/arm/crypto/Kconfig | 2 + arch/arm/crypto/ghash-ce-core.S | 382 ++++++++++++++++++++++++++++++++++-- arch/arm/crypto/ghash-ce-glue.c | 423 +++++++++++++++++++++++++++++++++++++++- 3 files changed, 790 insertions(+), 17 deletions(-) (limited to 'arch/arm') diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig index 7b2b7d043d9b..847b7a003356 100644 --- a/arch/arm/crypto/Kconfig +++ b/arch/arm/crypto/Kconfig @@ -16,8 +16,10 @@ config CRYPTO_CURVE25519_NEON config CRYPTO_GHASH_ARM_CE tristate "Hash functions: GHASH (PMULL/NEON/ARMv8 Crypto Extensions)" depends on KERNEL_MODE_NEON + select CRYPTO_AEAD select CRYPTO_HASH select CRYPTO_CRYPTD + select CRYPTO_LIB_AES select CRYPTO_LIB_GF128MUL help GCM GHASH function (NIST SP800-38D) diff --git a/arch/arm/crypto/ghash-ce-core.S b/arch/arm/crypto/ghash-ce-core.S index 9f51e3fa4526..858c0d66798b 100644 --- a/arch/arm/crypto/ghash-ce-core.S +++ b/arch/arm/crypto/ghash-ce-core.S @@ -2,7 +2,8 @@ /* * Accelerated GHASH implementation with NEON/ARMv8 vmull.p8/64 instructions. * - * Copyright (C) 2015 - 2017 Linaro Ltd. + * Copyright (C) 2015 - 2017 Linaro Ltd. + * Copyright (C) 2023 Google LLC. */ #include @@ -44,7 +45,7 @@ t2q .req q7 t3q .req q8 t4q .req q9 - T2 .req q9 + XH2 .req q9 s1l .req d20 s1h .req d21 @@ -80,7 +81,7 @@ XL2 .req q5 XM2 .req q6 - XH2 .req q7 + T2 .req q7 T3 .req q8 XL2_L .req d10 @@ -192,9 +193,10 @@ vshr.u64 XL, XL, #1 .endm - .macro ghash_update, pn + .macro ghash_update, pn, enc, aggregate=1, head=1 vld1.64 {XL}, [r1] + .if \head /* do the head block first, if supplied */ ldr ip, [sp] teq ip, #0 @@ -202,13 +204,32 @@ vld1.64 {T1}, [ip] teq r0, #0 b 3f + .endif 0: .ifc \pn, p64 + .if \aggregate tst r0, #3 // skip until #blocks is a bne 2f // round multiple of 4 vld1.8 {XL2-XM2}, [r2]! -1: vld1.8 {T3-T2}, [r2]! +1: vld1.8 {T2-T3}, [r2]! + + .ifnb \enc + \enc\()_4x XL2, XM2, T2, T3 + + add ip, r3, #16 + vld1.64 {HH}, [ip, :128]! + vld1.64 {HH3-HH4}, [ip, :128] + + veor SHASH2_p64, SHASH_L, SHASH_H + veor SHASH2_H, HH_L, HH_H + veor HH34_L, HH3_L, HH3_H + veor HH34_H, HH4_L, HH4_H + + vmov.i8 MASK, #0xe1 + vshl.u64 MASK, MASK, #57 + .endif + vrev64.8 XL2, XL2 vrev64.8 XM2, XM2 @@ -218,8 +239,8 @@ veor XL2_H, XL2_H, XL_L veor XL, XL, T1 - vrev64.8 T3, T3 - vrev64.8 T1, T2 + vrev64.8 T1, T3 + vrev64.8 T3, T2 vmull.p64 XH, HH4_H, XL_H // a1 * b1 veor XL2_H, XL2_H, XL_H @@ -267,14 +288,22 @@ b 1b .endif + .endif + +2: vld1.8 {T1}, [r2]! + + .ifnb \enc + \enc\()_1x T1 + veor SHASH2_p64, SHASH_L, SHASH_H + vmov.i8 MASK, #0xe1 + vshl.u64 MASK, MASK, #57 + .endif -2: vld1.64 {T1}, [r2]! subs r0, r0, #1 3: /* multiply XL by SHASH in GF(2^128) */ -#ifndef CONFIG_CPU_BIG_ENDIAN vrev64.8 T1, T1 -#endif + vext.8 IN1, T1, T1, #8 veor T1_L, T1_L, XL_H veor XL, XL, IN1 @@ -293,9 +322,6 @@ veor XL, XL, T1 bne 0b - - vst1.64 {XL}, [r1] - bx lr .endm /* @@ -316,6 +342,9 @@ ENTRY(pmull_ghash_update_p64) vshl.u64 MASK, MASK, #57 ghash_update p64 + vst1.64 {XL}, [r1] + + bx lr ENDPROC(pmull_ghash_update_p64) ENTRY(pmull_ghash_update_p8) @@ -336,4 +365,331 @@ ENTRY(pmull_ghash_update_p8) vmov.i64 k48, #0xffffffffffff ghash_update p8 + vst1.64 {XL}, [r1] + + bx lr ENDPROC(pmull_ghash_update_p8) + + e0 .req q9 + e1 .req q10 + e2 .req q11 + e3 .req q12 + e0l .req d18 + e0h .req d19 + e2l .req d22 + e2h .req d23 + e3l .req d24 + e3h .req d25 + ctr .req q13 + ctr0 .req d26 + ctr1 .req d27 + + ek0 .req q14 + ek1 .req q15 + + .macro round, rk:req, regs:vararg + .irp r, \regs + aese.8 \r, \rk + aesmc.8 \r, \r + .endr + .endm + + .macro aes_encrypt, rkp, rounds, regs:vararg + vld1.8 {ek0-ek1}, [\rkp, :128]! + cmp \rounds, #12 + blt .L\@ // AES-128 + + round ek0, \regs + vld1.8 {ek0}, [\rkp, :128]! + round ek1, \regs + vld1.8 {ek1}, [\rkp, :128]! + + beq .L\@ // AES-192 + + round ek0, \regs + vld1.8 {ek0}, [\rkp, :128]! + round ek1, \regs + vld1.8 {ek1}, [\rkp, :128]! + +.L\@: .rept 4 + round ek0, \regs + vld1.8 {ek0}, [\rkp, :128]! + round ek1, \regs + vld1.8 {ek1}, [\rkp, :128]! + .endr + + round ek0, \regs + vld1.8 {ek0}, [\rkp, :128] + + .irp r, \regs + aese.8 \r, ek1 + .endr + .irp r, \regs + veor \r, \r, ek0 + .endr + .endm + +pmull_aes_encrypt: + add ip, r5, #4 + vld1.8 {ctr0}, [r5] // load 12 byte IV + vld1.8 {ctr1}, [ip] + rev r8, r7 + vext.8 ctr1, ctr1, ctr1, #4 + add r7, r7, #1 + vmov.32 ctr1[1], r8 + vmov e0, ctr + + add ip, r3, #64 + aes_encrypt ip, r6, e0 + bx lr +ENDPROC(pmull_aes_encrypt) + +pmull_aes_encrypt_4x: + add ip, r5, #4 + vld1.8 {ctr0}, [r5] + vld1.8 {ctr1}, [ip] + rev r8, r7 + vext.8 ctr1, ctr1, ctr1, #4 + add r7, r7, #1 + vmov.32 ctr1[1], r8 + rev ip, r7 + vmov e0, ctr + add r7, r7, #1 + vmov.32 ctr1[1], ip + rev r8, r7 + vmov e1, ctr + add r7, r7, #1 + vmov.32 ctr1[1], r8 + rev ip, r7 + vmov e2, ctr + add r7, r7, #1 + vmov.32 ctr1[1], ip + vmov e3, ctr + + add ip, r3, #64 + aes_encrypt ip, r6, e0, e1, e2, e3 + bx lr +ENDPROC(pmull_aes_encrypt_4x) + +pmull_aes_encrypt_final: + add ip, r5, #4 + vld1.8 {ctr0}, [r5] + vld1.8 {ctr1}, [ip] + rev r8, r7 + vext.8 ctr1, ctr1, ctr1, #4 + mov r7, #1 << 24 // BE #1 for the tag + vmov.32 ctr1[1], r8 + vmov e0, ctr + vmov.32 ctr1[1], r7 + vmov e1, ctr + + add ip, r3, #64 + aes_encrypt ip, r6, e0, e1 + bx lr +ENDPROC(pmull_aes_encrypt_final) + + .macro enc_1x, in0 + bl pmull_aes_encrypt + veor \in0, \in0, e0 + vst1.8 {\in0}, [r4]! + .endm + + .macro dec_1x, in0 + bl pmull_aes_encrypt + veor e0, e0, \in0 + vst1.8 {e0}, [r4]! + .endm + + .macro enc_4x, in0, in1, in2, in3 + bl pmull_aes_encrypt_4x + + veor \in0, \in0, e0 + veor \in1, \in1, e1 + veor \in2, \in2, e2 + veor \in3, \in3, e3 + + vst1.8 {\in0-\in1}, [r4]! + vst1.8 {\in2-\in3}, [r4]! + .endm + + .macro dec_4x, in0, in1, in2, in3 + bl pmull_aes_encrypt_4x + + veor e0, e0, \in0 + veor e1, e1, \in1 + veor e2, e2, \in2 + veor e3, e3, \in3 + + vst1.8 {e0-e1}, [r4]! + vst1.8 {e2-e3}, [r4]! + .endm + + /* + * void pmull_gcm_encrypt(int blocks, u64 dg[], const char *src, + * struct gcm_key const *k, char *dst, + * char *iv, int rounds, u32 counter) + */ +ENTRY(pmull_gcm_encrypt) + push {r4-r8, lr} + ldrd r4, r5, [sp, #24] + ldrd r6, r7, [sp, #32] + + vld1.64 {SHASH}, [r3] + + ghash_update p64, enc, head=0 + vst1.64 {XL}, [r1] + + pop {r4-r8, pc} +ENDPROC(pmull_gcm_encrypt) + + /* + * void pmull_gcm_decrypt(int blocks, u64 dg[], const char *src, + * struct gcm_key const *k, char *dst, + * char *iv, int rounds, u32 counter) + */ +ENTRY(pmull_gcm_decrypt) + push {r4-r8, lr} + ldrd r4, r5, [sp, #24] + ldrd r6, r7, [sp, #32] + + vld1.64 {SHASH}, [r3] + + ghash_update p64, dec, head=0 + vst1.64 {XL}, [r1] + + pop {r4-r8, pc} +ENDPROC(pmull_gcm_decrypt) + + /* + * void pmull_gcm_enc_final(int bytes, u64 dg[], char *tag, + * struct gcm_key const *k, char *head, + * char *iv, int rounds, u32 counter) + */ +ENTRY(pmull_gcm_enc_final) + push {r4-r8, lr} + ldrd r4, r5, [sp, #24] + ldrd r6, r7, [sp, #32] + + bl pmull_aes_encrypt_final + + cmp r0, #0 + beq .Lenc_final + + mov_l ip, .Lpermute + sub r4, r4, #16 + add r8, ip, r0 + add ip, ip, #32 + add r4, r4, r0 + sub ip, ip, r0 + + vld1.8 {e3}, [r8] // permute vector for key stream + vld1.8 {e2}, [ip] // permute vector for ghash input + + vtbl.8 e3l, {e0}, e3l + vtbl.8 e3h, {e0}, e3h + + vld1.8 {e0}, [r4] // encrypt tail block + veor e0, e0, e3 + vst1.8 {e0}, [r4] + + vtbl.8 T1_L, {e0}, e2l + vtbl.8 T1_H, {e0}, e2h + + vld1.64 {XL}, [r1] +.Lenc_final: + vld1.64 {SHASH}, [r3, :128] + vmov.i8 MASK, #0xe1 + veor SHASH2_p64, SHASH_L, SHASH_H + vshl.u64 MASK, MASK, #57 + mov r0, #1 + bne 3f // process head block first + ghash_update p64, aggregate=0, head=0 + + vrev64.8 XL, XL + vext.8 XL, XL, XL, #8 + veor XL, XL, e1 + + sub r2, r2, #16 // rewind src pointer + vst1.8 {XL}, [r2] // store tag + + pop {r4-r8, pc} +ENDPROC(pmull_gcm_enc_final) + + /* + * int pmull_gcm_dec_final(int bytes, u64 dg[], char *tag, + * struct gcm_key const *k, char *head, + * char *iv, int rounds, u32 counter, + * const char *otag, int authsize) + */ +ENTRY(pmull_gcm_dec_final) + push {r4-r8, lr} + ldrd r4, r5, [sp, #24] + ldrd r6, r7, [sp, #32] + + bl pmull_aes_encrypt_final + + cmp r0, #0 + beq .Ldec_final + + mov_l ip, .Lpermute + sub r4, r4, #16 + add r8, ip, r0 + add ip, ip, #32 + add r4, r4, r0 + sub ip, ip, r0 + + vld1.8 {e3}, [r8] // permute vector for key stream + vld1.8 {e2}, [ip] // permute vector for ghash input + + vtbl.8 e3l, {e0}, e3l + vtbl.8 e3h, {e0}, e3h + + vld1.8 {e0}, [r4] + + vtbl.8 T1_L, {e0}, e2l + vtbl.8 T1_H, {e0}, e2h + + veor e0, e0, e3 + vst1.8 {e0}, [r4] + + vld1.64 {XL}, [r1] +.Ldec_final: + vld1.64 {SHASH}, [r3] + vmov.i8 MASK, #0xe1 + veor SHASH2_p64, SHASH_L, SHASH_H + vshl.u64 MASK, MASK, #57 + mov r0, #1 + bne 3f // process head block first + ghash_update p64, aggregate=0, head=0 + + vrev64.8 XL, XL + vext.8 XL, XL, XL, #8 + veor XL, XL, e1 + + mov_l ip, .Lpermute + ldrd r2, r3, [sp, #40] // otag and authsize + vld1.8 {T1}, [r2] + add ip, ip, r3 + vceq.i8 T1, T1, XL // compare tags + vmvn T1, T1 // 0 for eq, -1 for ne + + vld1.8 {e0}, [ip] + vtbl.8 XL_L, {T1}, e0l // keep authsize bytes only + vtbl.8 XL_H, {T1}, e0h + + vpmin.s8 XL_L, XL_L, XL_H // take the minimum s8 across the vector + vpmin.s8 XL_L, XL_L, XL_L + vmov.32 r0, XL_L[0] // fail if != 0x0 + + pop {r4-r8, pc} +ENDPROC(pmull_gcm_dec_final) + + .section ".rodata", "a", %progbits + .align 5 +.Lpermute: + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff + .byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 + .byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff + .byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glue.c index f13401f3e669..3ddf05b4234d 100644 --- a/arch/arm/crypto/ghash-ce-glue.c +++ b/arch/arm/crypto/ghash-ce-glue.c @@ -2,36 +2,53 @@ /* * Accelerated GHASH implementation with ARMv8 vmull.p64 instructions. * - * Copyright (C) 2015 - 2018 Linaro Ltd. + * Copyright (C) 2015 - 2018 Linaro Ltd. + * Copyright (C) 2023 Google LLC. */ #include #include #include #include +#include +#include #include #include +#include #include #include +#include #include +#include #include #include #include #include MODULE_DESCRIPTION("GHASH hash function using ARMv8 Crypto Extensions"); -MODULE_AUTHOR("Ard Biesheuvel "); -MODULE_LICENSE("GPL v2"); +MODULE_AUTHOR("Ard Biesheuvel "); +MODULE_LICENSE("GPL"); MODULE_ALIAS_CRYPTO("ghash"); +MODULE_ALIAS_CRYPTO("gcm(aes)"); +MODULE_ALIAS_CRYPTO("rfc4106(gcm(aes))"); #define GHASH_BLOCK_SIZE 16 #define GHASH_DIGEST_SIZE 16 +#define RFC4106_NONCE_SIZE 4 + struct ghash_key { be128 k; u64 h[][2]; }; +struct gcm_key { + u64 h[4][2]; + u32 rk[AES_MAX_KEYLENGTH_U32]; + int rounds; + u8 nonce[]; // for RFC4106 nonce +}; + struct ghash_desc_ctx { u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)]; u8 buf[GHASH_BLOCK_SIZE]; @@ -344,6 +361,393 @@ static struct ahash_alg ghash_async_alg = { }, }; + +void pmull_gcm_encrypt(int blocks, u64 dg[], const char *src, + struct gcm_key const *k, char *dst, + const char *iv, int rounds, u32 counter); + +void pmull_gcm_enc_final(int blocks, u64 dg[], char *tag, + struct gcm_key const *k, char *head, + const char *iv, int rounds, u32 counter); + +void pmull_gcm_decrypt(int bytes, u64 dg[], const char *src, + struct gcm_key const *k, char *dst, + const char *iv, int rounds, u32 counter); + +int pmull_gcm_dec_final(int bytes, u64 dg[], char *tag, + struct gcm_key const *k, char *head, + const char *iv, int rounds, u32 counter, + const char *otag, int authsize); + +static int gcm_aes_setkey(struct crypto_aead *tfm, const u8 *inkey, + unsigned int keylen) +{ + struct gcm_key *ctx = crypto_aead_ctx(tfm); + struct crypto_aes_ctx aes_ctx; + be128 h, k; + int ret; + + ret = aes_expandkey(&aes_ctx, inkey, keylen); + if (ret) + return -EINVAL; + + aes_encrypt(&aes_ctx, (u8 *)&k, (u8[AES_BLOCK_SIZE]){}); + + memcpy(ctx->rk, aes_ctx.key_enc, sizeof(ctx->rk)); + ctx->rounds = 6 + keylen / 4; + + memzero_explicit(&aes_ctx, sizeof(aes_ctx)); + + ghash_reflect(ctx->h[0], &k); + + h = k; + gf128mul_lle(&h, &k); + ghash_reflect(ctx->h[1], &h); + + gf128mul_lle(&h, &k); + ghash_reflect(ctx->h[2], &h); + + gf128mul_lle(&h, &k); + ghash_reflect(ctx->h[3], &h); + + return 0; +} + +static int gcm_aes_setauthsize(struct crypto_aead *tfm, unsigned int authsize) +{ + return crypto_gcm_check_authsize(authsize); +} + +static void gcm_update_mac(u64 dg[], const u8 *src, int count, u8 buf[], + int *buf_count, struct gcm_key *ctx) +{ + if (*buf_count > 0) { + int buf_added = min(count, GHASH_BLOCK_SIZE - *buf_count); + + memcpy(&buf[*buf_count], src, buf_added); + + *buf_count += buf_added; + src += buf_added; + count -= buf_added; + } + + if (count >= GHASH_BLOCK_SIZE || *buf_count == GHASH_BLOCK_SIZE) { + int blocks = count / GHASH_BLOCK_SIZE; + + pmull_ghash_update_p64(blocks, dg, src, ctx->h, + *buf_count ? buf : NULL); + + src += blocks * GHASH_BLOCK_SIZE; + count %= GHASH_BLOCK_SIZE; + *buf_count = 0; + } + + if (count > 0) { + memcpy(buf, src, count); + *buf_count = count; + } +} + +static void gcm_calculate_auth_mac(struct aead_request *req, u64 dg[], u32 len) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_key *ctx = crypto_aead_ctx(aead); + u8 buf[GHASH_BLOCK_SIZE]; + struct scatter_walk walk; + int buf_count = 0; + + scatterwalk_start(&walk, req->src); + + do { + u32 n = scatterwalk_clamp(&walk, len); + u8 *p; + + if (!n) { + scatterwalk_start(&walk, sg_next(walk.sg)); + n = scatterwalk_clamp(&walk, len); + } + + p = scatterwalk_map(&walk); + gcm_update_mac(dg, p, n, buf, &buf_count, ctx); + scatterwalk_unmap(p); + + if (unlikely(len / SZ_4K > (len - n) / SZ_4K)) { + kernel_neon_end(); + kernel_neon_begin(); + } + + len -= n; + scatterwalk_advance(&walk, n); + scatterwalk_done(&walk, 0, len); + } while (len); + + if (buf_count) { + memset(&buf[buf_count], 0, GHASH_BLOCK_SIZE - buf_count); + pmull_ghash_update_p64(1, dg, buf, ctx->h, NULL); + } +} + +static int gcm_encrypt(struct aead_request *req, const u8 *iv, u32 assoclen) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_key *ctx = crypto_aead_ctx(aead); + struct skcipher_walk walk; + u8 buf[AES_BLOCK_SIZE]; + u32 counter = 2; + u64 dg[2] = {}; + be128 lengths; + const u8 *src; + u8 *tag, *dst; + int tail, err; + + if (WARN_ON_ONCE(!may_use_simd())) + return -EBUSY; + + err = skcipher_walk_aead_encrypt(&walk, req, false); + + kernel_neon_begin(); + + if (assoclen) + gcm_calculate_auth_mac(req, dg, assoclen); + + src = walk.src.virt.addr; + dst = walk.dst.virt.addr; + + while (walk.nbytes >= AES_BLOCK_SIZE) { + int nblocks = walk.nbytes / AES_BLOCK_SIZE; + + pmull_gcm_encrypt(nblocks, dg, src, ctx, dst, iv, + ctx->rounds, counter); + counter += nblocks; + + if (walk.nbytes == walk.total) { + src += nblocks * AES_BLOCK_SIZE; + dst += nblocks * AES_BLOCK_SIZE; + break; + } + + kernel_neon_end(); + + err = skcipher_walk_done(&walk, + walk.nbytes % AES_BLOCK_SIZE); + if (err) + return err; + + src = walk.src.virt.addr; + dst = walk.dst.virt.addr; + + kernel_neon_begin(); + } + + + lengths.a = cpu_to_be64(assoclen * 8); + lengths.b = cpu_to_be64(req->cryptlen * 8); + + tag = (u8 *)&lengths; + tail = walk.nbytes % AES_BLOCK_SIZE; + + /* + * Bounce via a buffer unless we are encrypting in place and src/dst + * are not pointing to the start of the walk buffer. In that case, we + * can do a NEON load/xor/store sequence in place as long as we move + * the plain/ciphertext and keystream to the start of the register. If + * not, do a memcpy() to the end of the buffer so we can reuse the same + * logic. + */ + if (unlikely(tail && (tail == walk.nbytes || src != dst))) + src = memcpy(buf + sizeof(buf) - tail, src, tail); + + pmull_gcm_enc_final(tail, dg, tag, ctx, (u8 *)src, iv, + ctx->rounds, counter); + kernel_neon_end(); + + if (unlikely(tail && src != dst)) + memcpy(dst, src, tail); + + if (walk.nbytes) { + err = skcipher_walk_done(&walk, 0); + if (err) + return err; + } + + /* copy authtag to end of dst */ + scatterwalk_map_and_copy(tag, req->dst, req->assoclen + req->cryptlen, + crypto_aead_authsize(aead), 1); + + return 0; +} + +static int gcm_decrypt(struct aead_request *req, const u8 *iv, u32 assoclen) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_key *ctx = crypto_aead_ctx(aead); + int authsize = crypto_aead_authsize(aead); + struct skcipher_walk walk; + u8 otag[AES_BLOCK_SIZE]; + u8 buf[AES_BLOCK_SIZE]; + u32 counter = 2; + u64 dg[2] = {}; + be128 lengths; + const u8 *src; + u8 *tag, *dst; + int tail, err, ret; + + if (WARN_ON_ONCE(!may_use_simd())) + return -EBUSY; + + scatterwalk_map_and_copy(otag, req->src, + req->assoclen + req->cryptlen - authsize, + authsize, 0); + + err = skcipher_walk_aead_decrypt(&walk, req, false); + + kernel_neon_begin(); + + if (assoclen) + gcm_calculate_auth_mac(req, dg, assoclen); + + src = walk.src.virt.addr; + dst = walk.dst.virt.addr; + + while (walk.nbytes >= AES_BLOCK_SIZE) { + int nblocks = walk.nbytes / AES_BLOCK_SIZE; + + pmull_gcm_decrypt(nblocks, dg, src, ctx, dst, iv, + ctx->rounds, counter); + counter += nblocks; + + if (walk.nbytes == walk.total) { + src += nblocks * AES_BLOCK_SIZE; + dst += nblocks * AES_BLOCK_SIZE; + break; + } + + kernel_neon_end(); + + err = skcipher_walk_done(&walk, + walk.nbytes % AES_BLOCK_SIZE); + if (err) + return err; + + src = walk.src.virt.addr; + dst = walk.dst.virt.addr; + + kernel_neon_begin(); + } + + lengths.a = cpu_to_be64(assoclen * 8); + lengths.b = cpu_to_be64((req->cryptlen - authsize) * 8); + + tag = (u8 *)&lengths; + tail = walk.nbytes % AES_BLOCK_SIZE; + + if (unlikely(tail && (tail == walk.nbytes || src != dst))) + src = memcpy(buf + sizeof(buf) - tail, src, tail); + + ret = pmull_gcm_dec_final(tail, dg, tag, ctx, (u8 *)src, iv, + ctx->rounds, counter, otag, authsize); + kernel_neon_end(); + + if (unlikely(tail && src != dst)) + memcpy(dst, src, tail); + + if (walk.nbytes) { + err = skcipher_walk_done(&walk, 0); + if (err) + return err; + } + + return ret ? -EBADMSG : 0; +} + +static int gcm_aes_encrypt(struct aead_request *req) +{ + return gcm_encrypt(req, req->iv, req->assoclen); +} + +static int gcm_aes_decrypt(struct aead_request *req) +{ + return gcm_decrypt(req, req->iv, req->assoclen); +} + +static int rfc4106_setkey(struct crypto_aead *tfm, const u8 *inkey, + unsigned int keylen) +{ + struct gcm_key *ctx = crypto_aead_ctx(tfm); + int err; + + keylen -= RFC4106_NONCE_SIZE; + err = gcm_aes_setkey(tfm, inkey, keylen); + if (err) + return err; + + memcpy(ctx->nonce, inkey + keylen, RFC4106_NONCE_SIZE); + return 0; +} + +static int rfc4106_setauthsize(struct crypto_aead *tfm, unsigned int authsize) +{ + return crypto_rfc4106_check_authsize(authsize); +} + +static int rfc4106_encrypt(struct aead_request *req) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_key *ctx = crypto_aead_ctx(aead); + u8 iv[GCM_AES_IV_SIZE]; + + memcpy(iv, ctx->nonce, RFC4106_NONCE_SIZE); + memcpy(iv + RFC4106_NONCE_SIZE, req->iv, GCM_RFC4106_IV_SIZE); + + return crypto_ipsec_check_assoclen(req->assoclen) ?: + gcm_encrypt(req, iv, req->assoclen - GCM_RFC4106_IV_SIZE); +} + +static int rfc4106_decrypt(struct aead_request *req) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_key *ctx = crypto_aead_ctx(aead); + u8 iv[GCM_AES_IV_SIZE]; + + memcpy(iv, ctx->nonce, RFC4106_NONCE_SIZE); + memcpy(iv + RFC4106_NONCE_SIZE, req->iv, GCM_RFC4106_IV_SIZE); + + return crypto_ipsec_check_assoclen(req->assoclen) ?: + gcm_decrypt(req, iv, req->assoclen - GCM_RFC4106_IV_SIZE); +} + +static struct aead_alg gcm_aes_algs[] = {{ + .ivsize = GCM_AES_IV_SIZE, + .chunksize = AES_BLOCK_SIZE, + .maxauthsize = AES_BLOCK_SIZE, + .setkey = gcm_aes_setkey, + .setauthsize = gcm_aes_setauthsize, + .encrypt = gcm_aes_encrypt, + .decrypt = gcm_aes_decrypt, + + .base.cra_name = "gcm(aes)", + .base.cra_driver_name = "gcm-aes-ce", + .base.cra_priority = 400, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct gcm_key), + .base.cra_module = THIS_MODULE, +}, { + .ivsize = GCM_RFC4106_IV_SIZE, + .chunksize = AES_BLOCK_SIZE, + .maxauthsize = AES_BLOCK_SIZE, + .setkey = rfc4106_setkey, + .setauthsize = rfc4106_setauthsize, + .encrypt = rfc4106_encrypt, + .decrypt = rfc4106_decrypt, + + .base.cra_name = "rfc4106(gcm(aes))", + .base.cra_driver_name = "rfc4106-gcm-aes-ce", + .base.cra_priority = 400, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct gcm_key) + RFC4106_NONCE_SIZE, + .base.cra_module = THIS_MODULE, +}}; + static int __init ghash_ce_mod_init(void) { int err; @@ -352,13 +756,17 @@ static int __init ghash_ce_mod_init(void) return -ENODEV; if (elf_hwcap2 & HWCAP2_PMULL) { + err = crypto_register_aeads(gcm_aes_algs, + ARRAY_SIZE(gcm_aes_algs)); + if (err) + return err; ghash_alg.base.cra_ctxsize += 3 * sizeof(u64[2]); static_branch_enable(&use_p64); } err = crypto_register_shash(&ghash_alg); if (err) - return err; + goto err_aead; err = crypto_register_ahash(&ghash_async_alg); if (err) goto err_shash; @@ -367,6 +775,10 @@ static int __init ghash_ce_mod_init(void) err_shash: crypto_unregister_shash(&ghash_alg); +err_aead: + if (elf_hwcap2 & HWCAP2_PMULL) + crypto_unregister_aeads(gcm_aes_algs, + ARRAY_SIZE(gcm_aes_algs)); return err; } @@ -374,6 +786,9 @@ static void __exit ghash_ce_mod_exit(void) { crypto_unregister_ahash(&ghash_async_alg); crypto_unregister_shash(&ghash_alg); + if (elf_hwcap2 & HWCAP2_PMULL) + crypto_unregister_aeads(gcm_aes_algs, + ARRAY_SIZE(gcm_aes_algs)); } module_init(ghash_ce_mod_init); -- cgit v1.2.3 From cfb1076d1549f6d500da65a4593a67cba19cb041 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Tue, 24 Jan 2023 19:16:18 +0100 Subject: ARM: 9288/1: Kconfigs: fix spelling & grammar Fix spelling (reported by codespell) and grammar in Arm Kconfig files. Signed-off-by: Randy Dunlap Cc: linux-arm-kernel@lists.infradead.org Cc: patches@armlinux.org.uk Signed-off-by: Russell King (Oracle) --- arch/arm/Kconfig.debug | 4 ++-- arch/arm/mm/Kconfig | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) (limited to 'arch/arm') diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug index c345775f035b..a1af9eb11d6f 100644 --- a/arch/arm/Kconfig.debug +++ b/arch/arm/Kconfig.debug @@ -1276,8 +1276,8 @@ choice depends on MACH_STM32MP157 select DEBUG_STM32_UART help - Say Y here if you want kernel low-level debugging support - on STM32MP1 based platforms, wich default UART is wired on + Say Y here if you want kernel low-level debugging support on + STM32MP1-based platforms, where the default UART is wired to UART4, but another UART instance can be selected by modifying CONFIG_DEBUG_UART_PHYS and CONFIG_DEBUG_UART_VIRT. diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index fc439c2c16f8..c5bbae86f725 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -743,7 +743,7 @@ config SWP_EMULATE If unsure, say Y. choice - prompt "CPU Endianess" + prompt "CPU Endianness" default CPU_LITTLE_ENDIAN config CPU_LITTLE_ENDIAN -- cgit v1.2.3 From 5eb6e280432ddc9b755193552f3a070da8d7455c Mon Sep 17 00:00:00 2001 From: Nathan Chancellor Date: Sun, 29 Jan 2023 15:49:07 +0100 Subject: ARM: 9289/1: Allow pre-ARMv5 builds with ld.lld 16.0.0 and newer Commit 6a7ee50f8f56 ("ARM: disallow pre-ARMv5 builds with ld.lld") prevented v4 or v4t kernels when ld.lld will link the kernel due to inserting unsupported blx instructions. ld.lld has been fixed in current main (16.0.0) to avoid inserting these instructions by inserting position independent thunks instead. Allow these configurations to be enabled when ld.lld 16.0.0 is used to link the kernel. Additionally, add a link to the upstream LLVM issue so that the reason for this dependency is clearly documented. Link: https://github.com/ClangBuiltLinux/linux/issues/964 Link: https://github.com/llvm/llvm-project/commit/6f9ff1beee9d12aca0c9caa9ae0051dc6d0a718c Suggested-by: Nick Desaulniers Tested-by: Arnd Bergmann Signed-off-by: Nathan Chancellor Signed-off-by: Russell King (Oracle) --- arch/arm/Kconfig | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'arch/arm') diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 45af77e67973..666a25d6147c 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -345,14 +345,16 @@ comment "CPU Core family selection" config ARCH_MULTI_V4 bool "ARMv4 based platforms (FA526, StrongARM)" depends on !ARCH_MULTI_V6_V7 - depends on !LD_IS_LLD + # https://github.com/llvm/llvm-project/issues/50764 + depends on !LD_IS_LLD || LLD_VERSION >= 160000 select ARCH_MULTI_V4_V5 select CPU_FA526 if !(CPU_SA110 || CPU_SA1100) config ARCH_MULTI_V4T bool "ARMv4T based platforms (ARM720T, ARM920T, ...)" depends on !ARCH_MULTI_V6_V7 - depends on !LD_IS_LLD + # https://github.com/llvm/llvm-project/issues/50764 + depends on !LD_IS_LLD || LLD_VERSION >= 160000 select ARCH_MULTI_V4_V5 select CPU_ARM920T if !(CPU_ARM7TDMI || CPU_ARM720T || \ CPU_ARM740T || CPU_ARM9TDMI || CPU_ARM922T || \ -- cgit v1.2.3