KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled

When dirty logging is enabled without initially-all-set, try to split all huge pages in the memslot down to 4KB pages so that vCPUs do not have to take expensive write-protection faults to split huge pages. Eager page splitting is best-effort only. This commit only adds the support for the TDP MMU, and even there splitting may fail due to out of memory conditions. Failures to split a huge page is fine from a correctness standpoint because KVM will always follow up splitting by write-protecting any remaining huge pages. Eager page splitting moves the cost of splitting huge pages off of the vCPU threads and onto the thread enabling dirty logging on the memslot. This is useful because: 1. Splitting on the vCPU thread interrupts vCPUs execution and is disruptive to customers whereas splitting on VM ioctl threads can run in parallel with vCPU execution. 2. Splitting all huge pages at once is more efficient because it does not require performing VM-exit handling or walking the page table for every 4KiB page in the memslot, and greatly reduces the amount of contention on the mmu_lock. For example, when running dirty_log_perf_test with 96 virtual CPUs, 1GiB per vCPU, and 1GiB HugeTLB memory, the time it takes vCPUs to write to all of their memory after dirty logging is enabled decreased by 95% from 2.94s to 0.14s. Eager Page Splitting is over 100x more efficient than the current implementation of splitting on fault under the read lock. For example, taking the same workload as above, Eager Page Splitting reduced the CPU required to split all huge pages from ~270 CPU-seconds ((2.94s - 0.14s) * 96 vCPU threads) to only 1.55 CPU-seconds. Eager page splitting does increase the amount of time it takes to enable dirty logging since it has split all huge pages. For example, the time it took to enable dirty logging in the 96GiB region of the aforementioned test increased from 0.001s to 1.55s. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220119230739.2234394-16-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
author: David Matlack <dmatlack@google.com> 2022-01-19 23:07:36 +0000
committer: Paolo Bonzini <pbonzini@redhat.com> 2022-02-10 13:50:42 -0500
commit: a3fe5dbda0a4bb7759dcd5a0ad713d347e020401 (patch)
tree: c04f89cb6ff4e72b98ebb65b6d5e4b1f57b2cb61 /arch/x86/kvm/mmu/mmu.c
parent: a82070b6e71a6642f87ef9e483ddc062c3571678 (diff)
download: lwn-a3fe5dbda0a4bb7759dcd5a0ad713d347e020401.tar.gz
lwn-a3fe5dbda0a4bb7759dcd5a0ad713d347e020401.zip
1 files changed, 24 insertions, 0 deletions
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a75e4ae88cde..308c8b21f9b1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5830,6 +5830,30 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
 		kvm_arch_flush_remote_tlbs_memslot(kvm, memslot);
 }
 
+void kvm_mmu_slot_try_split_huge_pages(struct kvm *kvm,
+				       const struct kvm_memory_slot *memslot,
+				       int target_level)
+{
+	u64 start = memslot->base_gfn;
+	u64 end = start + memslot->npages;
+
+	if (is_tdp_mmu_enabled(kvm)) {
+		read_lock(&kvm->mmu_lock);
+		kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end, target_level);
+		read_unlock(&kvm->mmu_lock);
+	}
+
+	/*
+	 * No TLB flush is necessary here. KVM will flush TLBs after
+	 * write-protecting and/or clearing dirty on the newly split SPTEs to
+	 * ensure that guest writes are reflected in the dirty log before the
+	 * ioctl to enable dirty logging on this memslot completes. Since the
+	 * split SPTEs retain the write and dirty bits of the huge SPTE, it is
+	 * safe for KVM to decide if a TLB flush is necessary based on the split
+	 * SPTEs.
+	 */
+}
+
 static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 					 struct kvm_rmap_head *rmap_head,
 					 const struct kvm_memory_slot *slot)
author	David Matlack <dmatlack@google.com>	2022-01-19 23:07:36 +0000
committer	Paolo Bonzini <pbonzini@redhat.com>	2022-02-10 13:50:42 -0500
commit	a3fe5dbda0a4bb7759dcd5a0ad713d347e020401 (patch)
tree	c04f89cb6ff4e72b98ebb65b6d5e4b1f57b2cb61 /arch/x86/kvm/mmu/mmu.c
parent	a82070b6e71a6642f87ef9e483ddc062c3571678 (diff)
download	lwn-a3fe5dbda0a4bb7759dcd5a0ad713d347e020401.tar.gz lwn-a3fe5dbda0a4bb7759dcd5a0ad713d347e020401.zip