diff options
author | Paolo Bonzini <pbonzini@redhat.com> | 2024-07-17 13:04:48 -0400 |
---|---|---|
committer | Paolo Bonzini <pbonzini@redhat.com> | 2024-07-26 14:46:14 -0400 |
commit | 5932ca411e533e7ad2b97c47b4357a05fa06c2a5 (patch) | |
tree | 7a164c01d4fb3b3f840d9b464d7685c9ba109a76 /Documentation | |
parent | c2adcf051be01d3da1e138c178a680514299461b (diff) | |
download | lwn-5932ca411e533e7ad2b97c47b4357a05fa06c2a5.tar.gz lwn-5932ca411e533e7ad2b97c47b4357a05fa06c2a5.zip |
KVM: x86: disallow pre-fault for SNP VMs before initialization
KVM_PRE_FAULT_MEMORY for an SNP guest can race with
sev_gmem_post_populate() in bad ways. The following sequence for
instance can potentially trigger an RMP fault:
thread A, sev_gmem_post_populate: called
thread B, sev_gmem_prepare: places below 'pfn' in a private state in RMP
thread A, sev_gmem_post_populate: *vaddr = kmap_local_pfn(pfn + i);
thread A, sev_gmem_post_populate: copy_from_user(vaddr, src + i * PAGE_SIZE, PAGE_SIZE);
RMP #PF
Fix this by only allowing KVM_PRE_FAULT_MEMORY to run after a guest's
initial private memory contents have been finalized via
KVM_SEV_SNP_LAUNCH_FINISH.
Beyond fixing this issue, it just sort of makes sense to enforce this,
since the KVM_PRE_FAULT_MEMORY documentation states:
"KVM maps memory as if the vCPU generated a stage-2 read page fault"
which sort of implies we should be acting on the same guest state that a
vCPU would see post-launch after the initial guest memory is all set up.
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/virt/kvm/api.rst | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index ec1cd8aa1d56..7b512286f8d2 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6402,6 +6402,12 @@ for the current vCPU state. KVM maps memory as if the vCPU generated a stage-2 read page fault, e.g. faults in memory as needed, but doesn't break CoW. However, KVM does not mark any newly created stage-2 PTE as Accessed. +In the case of confidential VM types where there is an initial set up of +private guest memory before the guest is 'finalized'/measured, this ioctl +should only be issued after completing all the necessary setup to put the +guest into a 'finalized' state so that the above semantics can be reliably +ensured. + In some cases, multiple vCPUs might share the page tables. In this case, the ioctl can be called in parallel. |