Age | Commit message (Collapse) | Author |
|
When KVM is in protected mode, host calls to PSCI are proxied via EL2,
and cold entries from CPU_ON, CPU_SUSPEND, and SYSTEM_SUSPEND bounce
through __kvm_hyp_init_cpu() at EL2 before entering the host kernel's
entry point at EL1. While __kvm_hyp_init_cpu() initializes SPSR_EL2 for
the exception return to EL1, it does not initialize SCTLR_EL1.
Due to this, it's possible to enter EL1 with SCTLR_EL1 in an UNKNOWN
state. In practice this has been seen to result in kernel crashes after
CPU_ON as a result of SCTLR_EL1.M being 1 in violation of the initial
core configuration specified by PSCI.
Fix this by initializing SCTLR_EL1 for cold entry to the host kernel.
As it's necessary to write to SCTLR_EL12 in VHE mode, this
initialization is moved into __kvm_host_psci_cpu_entry() where we can
use write_sysreg_el1().
The remnants of the '__init_el2_nvhe_prepare_eret' macro are folded into
its only caller, as this is clearer than having the macro.
Fixes: cdf367192766ad11 ("KVM: arm64: Intercept host's CPU_ON SMCs")
Reported-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ahmed Genidi <ahmed.genidi@arm.com>
[ Mark: clarify commit message, handle E2H, move to C, remove macro ]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ahmed Genidi <ahmed.genidi@arm.com>
Cc: Ben Horgan <ben.horgan@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250227180526.1204723-3-mark.rutland@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Pass through the SYSTEM_OFF2 function for hibernation, just like SYSTEM_OFF.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Miguel Luis <miguel.luis@oracle.com>
Link: https://lore.kernel.org/r/20241019172459.2241939-6-dwmw2@infradead.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
In order to facilitate the introduction of new per-CPU state,
add a new host_data_ptr() helped that hides some of the per-CPU
verbosity, and make it easier to move that state around in the
future.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When SVE is enabled, the host may set bit 16 in SMCCC function IDs, a
hint that indicates an unused SVE state. At the moment NVHE doesn't
account for this bit when inspecting the function ID, and rejects most
calls. Clear the hint bit before comparing function IDs.
About version compatibility: the host's PSCI driver initially probes the
firmware for a SMCCC version number. If the firmware implements a
protocol recent enough (1.3), subsequent SMCCC calls have the hint bit
set. Since the hint bit was reserved in earlier versions of the
protocol, clearing it is fine regardless of the version in use.
When a new hint is added to the protocol in the future, it will be added
to ARM_SMCCC_CALL_HINTS and NVHE will handle it straight away. This
patch only clears known hints and leaves reserved bits as is, because
future SMCCC versions could use reserved bits as modifiers for the
function ID, rather than hints.
Fixes: cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint")
Reported-by: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230911145254.934414-4-jean-philippe@linaro.org
|
|
Some bti instructions were missing from
commit b53d4a272349 ("KVM: arm64: Use BTI for nvhe")
1) kvm_host_psci_cpu_entry
kvm_host_psci_cpu_entry is called from __kvm_hyp_init_cpu through "br"
instruction as __kvm_hyp_init_cpu resides in idmap section while
kvm_host_psci_cpu_entry is in hyp .text so the offset is larger than
128MB range covered by "b".
Which means that this function should start with "bti j" instruction.
LLVM which is the only compiler supporting BTI for Linux, adds "bti j"
for jump tables or by when taking the address of the block [1].
Same behaviour is observed with GCC.
As kvm_host_psci_cpu_entry is a C function, this must be done in
assembly.
Another solution is to use X16/X17 with "br", as according to ARM
ARM DDI0487I.a RLJHCL/IGMGRS, PACIASP has an implicit branch
target identification instruction that is compatible with
PSTATE.BTYPE 0b01 which includes "br X16/X17"
And the kvm_host_psci_cpu_entry has PACIASP as it is an external
function.
Although, using explicit "bti" makes it more clear than relying on
which register is used.
A third solution is to clear SCTLR_EL2.BT, which would make PACIASP
compatible PSTATE.BTYPE 0b11 ("br" to other registers).
However this deviates from the kernel behaviour (in bti_enable()).
2) Spectre vector table
"br" instructions are generated at runtime for the vector table
(__bp_harden_hyp_vecs).
These branches would land on vectors in __kvm_hyp_vector at offset 8.
As all the macros are defined with valid_vect/invalid_vect, it is
sufficient to add "bti j" at the correct offset.
[1] https://reviews.llvm.org/D52867
Fixes: b53d4a272349 ("KVM: arm64: Use BTI for nvhe")
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230706152240.685684-1-smostafa@google.com
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
|
|
With nVHE, the host currently creates all stage 1 hypervisor mappings at
EL1 during boot, installs them at EL2, and extends them as required
(e.g. when creating a new VM). But in a world where the host is no
longer trusted, it cannot have full control over the code mapped in the
hypervisor.
In preparation for enabling the hypervisor to create its own stage 1
mappings during boot, introduce an early page allocator, with minimal
functionality. This allocator is designed to be used only during early
bootstrap of the hyp code when memory protection is enabled, which will
then switch to using a full-fledged page allocator after init.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210319100146.1149909-11-qperret@google.com
|
|
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Hyp code used the hyp_symbol_addr helper to force PC-relative addressing
because absolute addressing results in kernel VAs due to the way hyp
code is linked. This is not true anymore, so remove the helper and
update all of its users.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210105180541.65031-9-dbrazdil@google.com
|
|
The KVM/arm64 PSCI relay assumes that SYSTEM_OFF and SYSTEM_RESET should
not return, as dictated by the PSCI spec. However, there is firmware out
there which breaks this assumption, leading to a hyp panic. Make KVM
more robust to broken firmware by allowing these to return.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201229160059.64135-1-dbrazdil@google.com
|
|
Although there is nothing wrong with the current host PSCI relay
implementation, we can clean it up and remove some of the helpers
that do not improve the overall readability of the legacy PSCI 0.1
handling.
Opportunity is taken to turn the bitmap into a set of booleans,
and creative use of preprocessor macros make init and check
more concise/readable.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Minor cleanup removing unused includes.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201208142452.87237-6-dbrazdil@google.com
|
|
PSCI driver exposes a struct containing the PSCI v0.1 function IDs
configured in the DT. However, the struct does not convey the
information whether these were set from DT or contain the default value
zero. This could be a problem for PSCI proxy in KVM protected mode.
Extend config passed to KVM with a bit mask with individual bits set
depending on whether the corresponding function pointer in psci_ops is
set, eg. set bit for PSCI_CPU_SUSPEND if psci_ops.cpu_suspend != NULL.
Previously config was split into multiple global variables. Put
everything into a single struct for convenience.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201208142452.87237-2-dbrazdil@google.com
|
|
Add a handler of SYSTEM_SUSPEND host PSCI SMCs. The semantics are
equivalent to CPU_SUSPEND, typically called on the last online CPU.
Reuse the same entry point and boot args struct as CPU_SUSPEND.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-24-dbrazdil@google.com
|
|
Add a handler of CPU_SUSPEND host PSCI SMCs. The SMC can either enter
a sleep state indistinguishable from a WFI or a deeper sleep state that
behaves like a CPU_OFF+CPU_ON except that the core is still considered
online while asleep.
The handler saves r0,pc of the host and makes the same call to EL3 with
the hyp CPU entry point. It either returns back to the handler and then
back to the host, or wakes up into the entry point and initializes EL2
state before dropping back to EL1. No EL2 state needs to be
saved/restored for this purpose.
CPU_ON and CPU_SUSPEND are both implemented using struct psci_boot_args
to store the state upon powerup, with each CPU having separate structs
for CPU_ON and CPU_SUSPEND so that CPU_SUSPEND can operate locklessly
and so that a CPU_ON call targeting a CPU cannot interfere with
a concurrent CPU_SUSPEND call on that CPU.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-23-dbrazdil@google.com
|
|
Add a handler of the CPU_ON PSCI call from host. When invoked, it looks
up the logical CPU ID corresponding to the provided MPIDR and populates
the state struct of the target CPU with the provided x0, pc. It then
calls CPU_ON itself, with an entry point in hyp that initializes EL2
state before returning ERET to the provided PC in EL1.
There is a simple atomic lock around the boot args struct. If it is
already locked, CPU_ON will return PENDING_ON error code.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-22-dbrazdil@google.com
|
|
Forward the following PSCI SMCs issued by host to EL3 as they do not
require the hypervisor's intervention. This assumes that EL3 correctly
implements the PSCI specification.
Only function IDs implemented in Linux are included.
Where both 32-bit and 64-bit variants exist, it is assumed that the host
will always use the 64-bit variant.
* SMCs that only return information about the system
* PSCI_VERSION - PSCI version implemented by EL3
* PSCI_FEATURES - optional features supported by EL3
* AFFINITY_INFO - power state of core/cluster
* MIGRATE_INFO_TYPE - whether Trusted OS can be migrated
* MIGRATE_INFO_UP_CPU - resident core of Trusted OS
* operations which do not affect the hypervisor
* MIGRATE - migrate Trusted OS to a different core
* SET_SUSPEND_MODE - toggle OS-initiated mode
* system shutdown/reset
* SYSTEM_OFF
* SYSTEM_RESET
* SYSTEM_RESET2
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-19-dbrazdil@google.com
|
|
Add a host-initialized constant to KVM nVHE hyp code for converting
between EL2 linear map virtual addresses and physical addresses.
Also add `__hyp_pa` macro that performs the conversion.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-18-dbrazdil@google.com
|
|
Add a handler of PSCI SMCs in nVHE hyp code. The handler is initialized
with the version used by the host's PSCI driver and the function IDs it
was configured with. If the SMC function ID matches one of the
configured PSCI calls (for v0.1) or falls into the PSCI function ID
range (for v0.2+), the SMC is handled by the PSCI handler. For now, all
SMCs return PSCI_RET_NOT_SUPPORTED.
Signed-off-by: David Brazdil <dbrazdil@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201202184122.26046-17-dbrazdil@google.com
|