1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
|
.. SPDX-License-Identifier: GPL-2.0
=====================
AMD Memory Encryption
=====================
Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV) are
features found on AMD processors.
SME provides the ability to mark individual pages of memory as encrypted using
the standard x86 page tables. A page that is marked encrypted will be
automatically decrypted when read from DRAM and encrypted when written to
DRAM. SME can therefore be used to protect the contents of DRAM from physical
attacks on the system.
SEV enables running encrypted virtual machines (VMs) in which the code and data
of the guest VM are secured so that a decrypted version is available only
within the VM itself. SEV guest VMs have the concept of private and shared
memory. Private memory is encrypted with the guest-specific key, while shared
memory may be encrypted with hypervisor key. When SME is enabled, the hypervisor
key is the same key which is used in SME.
A page is encrypted when a page table entry has the encryption bit set (see
below on how to determine its position). The encryption bit can also be
specified in the cr3 register, allowing the PGD table to be encrypted. Each
successive level of page tables can also be encrypted by setting the encryption
bit in the page table entry that points to the next table. This allows the full
page table hierarchy to be encrypted. Note, this means that just because the
encryption bit is set in cr3, doesn't imply the full hierarchy is encrypted.
Each page table entry in the hierarchy needs to have the encryption bit set to
achieve that. So, theoretically, you could have the encryption bit set in cr3
so that the PGD is encrypted, but not set the encryption bit in the PGD entry
for a PUD which results in the PUD pointed to by that entry to not be
encrypted.
When SEV is enabled, instruction pages and guest page tables are always treated
as private. All the DMA operations inside the guest must be performed on shared
memory. Since the memory encryption bit is controlled by the guest OS when it
is operating in 64-bit or 32-bit PAE mode, in all other modes the SEV hardware
forces the memory encryption bit to 1.
Support for SME and SEV can be determined through the CPUID instruction. The
CPUID function 0x8000001f reports information related to SME::
0x8000001f[eax]:
Bit[0] indicates support for SME
Bit[1] indicates support for SEV
0x8000001f[ebx]:
Bits[5:0] pagetable bit number used to activate memory
encryption
Bits[11:6] reduction in physical address space, in bits, when
memory encryption is enabled (this only affects
system physical addresses, not guest physical
addresses)
If support for SME is present, MSR 0xc00100010 (MSR_AMD64_SYSCFG) can be used to
determine if SME is enabled and/or to enable memory encryption::
0xc0010010:
Bit[23] 0 = memory encryption features are disabled
1 = memory encryption features are enabled
If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if
SEV is active::
0xc0010131:
Bit[0] 0 = memory encryption is not active
1 = memory encryption is active
Linux relies on BIOS to set this bit if BIOS has determined that the reduction
in the physical address space as a result of enabling memory encryption (see
CPUID information above) will not conflict with the address space resource
requirements for the system. If this bit is not set upon Linux startup then
Linux itself will not set it and memory encryption will not be possible.
The state of SME in the Linux kernel can be documented as follows:
- Supported:
The CPU supports SME (determined through CPUID instruction).
- Enabled:
Supported and bit 23 of MSR_AMD64_SYSCFG is set.
- Active:
Supported, Enabled and the Linux kernel is actively applying
the encryption bit to page table entries (the SME mask in the
kernel is non-zero).
SME can also be enabled and activated in the BIOS. If SME is enabled and
activated in the BIOS, then all memory accesses will be encrypted and it
will not be necessary to activate the Linux memory encryption support.
If the BIOS merely enables SME (sets bit 23 of the MSR_AMD64_SYSCFG),
then memory encryption can be enabled by supplying mem_encrypt=on on the
kernel command line. However, if BIOS does not enable SME, then Linux
will not be able to activate memory encryption, even if configured to do
so by default or the mem_encrypt=on command line parameter is specified.
Secure Nested Paging (SNP)
==========================
SEV-SNP introduces new features (SEV_FEATURES[1:63]) which can be enabled
by the hypervisor for security enhancements. Some of these features need
guest side implementation to function correctly. The below table lists the
expected guest behavior with various possible scenarios of guest/hypervisor
SNP feature support.
+-----------------+---------------+---------------+------------------+
| Feature Enabled | Guest needs | Guest has | Guest boot |
| by the HV | implementation| implementation| behaviour |
+=================+===============+===============+==================+
| No | No | No | Boot |
| | | | |
+-----------------+---------------+---------------+------------------+
| No | Yes | No | Boot |
| | | | |
+-----------------+---------------+---------------+------------------+
| No | Yes | Yes | Boot |
| | | | |
+-----------------+---------------+---------------+------------------+
| Yes | No | No | Boot with |
| | | | feature enabled |
+-----------------+---------------+---------------+------------------+
| Yes | Yes | No | Graceful boot |
| | | | failure |
+-----------------+---------------+---------------+------------------+
| Yes | Yes | Yes | Boot with |
| | | | feature enabled |
+-----------------+---------------+---------------+------------------+
More details in AMD64 APM[1] Vol 2: 15.34.10 SEV_STATUS MSR
Reverse Map Table (RMP)
=======================
The RMP is a structure in system memory that is used to ensure a one-to-one
mapping between system physical addresses and guest physical addresses. Each
page of memory that is potentially assignable to guests has one entry within
the RMP.
The RMP table can be either contiguous in memory or a collection of segments
in memory.
Contiguous RMP
--------------
Support for this form of the RMP is present when support for SEV-SNP is
present, which can be determined using the CPUID instruction::
0x8000001f[eax]:
Bit[4] indicates support for SEV-SNP
The location of the RMP is identified to the hardware through two MSRs::
0xc0010132 (RMP_BASE):
System physical address of the first byte of the RMP
0xc0010133 (RMP_END):
System physical address of the last byte of the RMP
Hardware requires that RMP_BASE and (RPM_END + 1) be 8KB aligned, but SEV
firmware increases the alignment requirement to require a 1MB alignment.
The RMP consists of a 16KB region used for processor bookkeeping followed
by the RMP entries, which are 16 bytes in size. The size of the RMP
determines the range of physical memory that the hypervisor can assign to
SEV-SNP guests. The RMP covers the system physical address from::
0 to ((RMP_END + 1 - RMP_BASE - 16KB) / 16B) x 4KB.
The current Linux support relies on BIOS to allocate/reserve the memory for
the RMP and to set RMP_BASE and RMP_END appropriately. Linux uses the MSR
values to locate the RMP and determine the size of the RMP. The RMP must
cover all of system memory in order for Linux to enable SEV-SNP.
Segmented RMP
-------------
Segmented RMP support is a new way of representing the layout of an RMP.
Initial RMP support required the RMP table to be contiguous in memory.
RMP accesses from a NUMA node on which the RMP doesn't reside
can take longer than accesses from a NUMA node on which the RMP resides.
Segmented RMP support allows the RMP entries to be located on the same
node as the memory the RMP is covering, potentially reducing latency
associated with accessing an RMP entry associated with the memory. Each
RMP segment covers a specific range of system physical addresses.
Support for this form of the RMP can be determined using the CPUID
instruction::
0x8000001f[eax]:
Bit[23] indicates support for segmented RMP
If supported, segmented RMP attributes can be found using the CPUID
instruction::
0x80000025[eax]:
Bits[5:0] minimum supported RMP segment size
Bits[11:6] maximum supported RMP segment size
0x80000025[ebx]:
Bits[9:0] number of cacheable RMP segment definitions
Bit[10] indicates if the number of cacheable RMP segments
is a hard limit
To enable a segmented RMP, a new MSR is available::
0xc0010136 (RMP_CFG):
Bit[0] indicates if segmented RMP is enabled
Bits[13:8] contains the size of memory covered by an RMP
segment (expressed as a power of 2)
The RMP segment size defined in the RMP_CFG MSR applies to all segments
of the RMP. Therefore each RMP segment covers a specific range of system
physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then
the RMP segment coverage value is 0x24 => 36, meaning the size of memory
covered by an RMP segment is 64GB (1 << 36). So the first RMP segment
covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment
covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc.
When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping
area as it does today (16K in size). However, instead of RMP entries
beginning immediately after the bookkeeping area, there is a 4K RMP
segment table (RST). Each entry in the RST is 8-bytes in size and represents
an RMP segment::
Bits[19:0] mapped size (in GB)
The mapped size can be less than the defined segment size.
A value of zero, indicates that no RMP exists for the range
of system physical addresses associated with this segment.
Bits[51:20] segment physical address
This address is left shift 20-bits (or just masked when
read) to form the physical address of the segment (1MB
alignment).
The RST can hold 512 segment entries but can be limited in size to the number
of cacheable RMP segments (CPUID 0x80000025_EBX[9:0]) if the number of cacheable
RMP segments is a hard limit (CPUID 0x80000025_EBX[10]).
The current Linux support relies on BIOS to allocate/reserve the memory for
the segmented RMP (the bookkeeping area, RST, and all segments), build the RST
and to set RMP_BASE, RMP_END, and RMP_CFG appropriately. Linux uses the MSR
values to locate the RMP and determine the size and location of the RMP
segments. The RMP must cover all of system memory in order for Linux to enable
SEV-SNP.
More details in the AMD64 APM Vol 2, section "15.36.3 Reverse Map Table",
docID: 24593.
Secure VM Service Module (SVSM)
===============================
SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which
defines four privilege levels at which guest software can run. The most
privileged level is 0 and numerically higher numbers have lesser privileges.
More details in the AMD64 APM Vol 2, section "15.35.7 Virtual Machine
Privilege Levels", docID: 24593.
When using that feature, different services can run at different protection
levels, apart from the guest OS but still within the secure SNP environment.
They can provide services to the guest, like a vTPM, for example.
When a guest is not running at VMPL0, it needs to communicate with the software
running at VMPL0 to perform privileged operations or to interact with secure
services. An example fur such a privileged operation is PVALIDATE which is
*required* to be executed at VMPL0.
In this scenario, the software running at VMPL0 is usually called a Secure VM
Service Module (SVSM). Discovery of an SVSM and the API used to communicate
with it is documented in "Secure VM Service Module for SEV-SNP Guests", docID:
58019.
(Latest versions of the above-mentioned documents can be found by using
a search engine like duckduckgo.com and typing in:
site:amd.com "Secure VM Service Module for SEV-SNP Guests", docID: 58019
for example.)
|