<feed xmlns='http://www.w3.org/2005/Atom'>
<title>lwn.git/Documentation/virtual, branch docs-5.2a</title>
<subtitle>Linux kernel documentation tree maintained by Jonathan Corbet</subtitle>
<id>http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.2a</id>
<link rel='self' href='http://mirrors.hust.edu.cn/git/lwn.git/atom?h=docs-5.2a'/>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/'/>
<updated>2019-03-15T22:00:28+00:00</updated>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2019-03-15T22:00:28+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-15T22:00:28+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=636deed6c0bc137a7c4f4a97ae1fcf0ad75323da'/>
<id>urn:sha1:636deed6c0bc137a7c4f4a97ae1fcf0ad75323da</id>
<content type='text'>
Pull KVM updates from Paolo Bonzini:
 "ARM:
   - some cleanups
   - direct physical timer assignment
   - cache sanitization for 32-bit guests

  s390:
   - interrupt cleanup
   - introduction of the Guest Information Block
   - preparation for processor subfunctions in cpu models

  PPC:
   - bug fixes and improvements, especially related to machine checks
     and protection keys

  x86:
   - many, many cleanups, including removing a bunch of MMU code for
     unnecessary optimizations
   - AVIC fixes

  Generic:
   - memcg accounting"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits)
  kvm: vmx: fix formatting of a comment
  KVM: doc: Document the life cycle of a VM and its resources
  MAINTAINERS: Add KVM selftests to existing KVM entry
  Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
  KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
  KVM: PPC: Fix compilation when KVM is not enabled
  KVM: Minor cleanups for kvm_main.c
  KVM: s390: add debug logging for cpu model subfunctions
  KVM: s390: implement subfunction processor calls
  arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
  KVM: arm/arm64: Remove unused timer variable
  KVM: PPC: Book3S: Improve KVM reference counting
  KVM: PPC: Book3S HV: Fix build failure without IOMMU support
  Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()"
  x86: kvmguest: use TSC clocksource if invariant TSC is exposed
  KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start
  KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter
  KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns
  KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()
  KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children
  ...
</content>
</entry>
<entry>
<title>KVM: doc: Document the life cycle of a VM and its resources</title>
<updated>2019-03-15T18:24:33+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-02-15T20:48:40+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=eca6be566d47029f945a5f8e1c94d374e31df2ca'/>
<id>urn:sha1:eca6be566d47029f945a5f8e1c94d374e31df2ca</id>
<content type='text'>
The series to add memcg accounting to KVM allocations[1] states:

  There are many KVM kernel memory allocations which are tied to the
  life of the VM process and should be charged to the VM process's
  cgroup.

While it is correct to account KVM kernel allocations to the cgroup of
the process that created the VM, it's technically incorrect to state
that the KVM kernel memory allocations are tied to the life of the VM
process.  This is because the VM itself, i.e. struct kvm, is not tied to
the life of the process which created it, rather it is tied to the life
of its associated file descriptor.  In other words, kvm_destroy_vm() is
not invoked until fput() decrements its associated file's refcount to
zero.  A simple example is to fork() in Qemu and have the child sleep
indefinitely; kvm_destroy_vm() isn't called until Qemu closes its file
descriptor *and* the rogue child is killed.

The allocations are guaranteed to be *accounted* to the process which
created the VM, but only because KVM's per-{VM,vCPU} ioctls reject the
ioctl() with -EIO if kvm-&gt;mm != current-&gt;mm.  I.e. the child can keep
the VM "alive" but can't do anything useful with its reference.

Note that because 'struct kvm' also holds a reference to the mm_struct
of its owner, the above behavior also applies to userspace allocations.

Given that mucking with a VM's file descriptor can lead to subtle and
undesirable behavior, e.g. memcg charges persisting after a VM is shut
down, explicitly document a VM's lifecycle and its impact on the VM's
resources.

Alternatively, KVM could aggressively free resources when the creating
process exits, e.g. via mmu_notifier-&gt;release().  However, mmu_notifier
isn't guaranteed to be available, and freeing resources when the creator
exits is likely to be error prone and fragile as KVM would need to
ensure that it only freed resources that are truly out of reach. In
practice, the existing behavior shouldn't be problematic as a properly
configured system will prevent a child process from being moved out of
the appropriate cgroup hierarchy, i.e. prevent hiding the process from
the OOM killer, and will prevent an unprivileged user from being able to
to hold a reference to struct kvm via another method, e.g. debugfs.

[1]https://patchwork.kernel.org/patch/10806707/

Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>virtio-ccw: diag 500 may return a negative cookie</title>
<updated>2019-03-06T16:19:33+00:00</updated>
<author>
<name>Cornelia Huck</name>
<email>cohuck@redhat.com</email>
</author>
<published>2019-01-21T12:19:42+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=8457fdfeb16d307b2acd502cb9224d03174294d2'/>
<id>urn:sha1:8457fdfeb16d307b2acd502cb9224d03174294d2</id>
<content type='text'>
If something goes wrong in the kvm io bus handling, the virtio-ccw
diagnose may return a negative error value in the cookie gpr.

Document this.

Reviewed-by: Halil Pasic &lt;pasic@linux.ibm.com&gt;
Signed-off-by: Cornelia Huck &lt;cohuck@redhat.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter</title>
<updated>2019-02-20T21:48:50+00:00</updated>
<author>
<name>Nir Weiner</name>
<email>nir.weiner@oracle.com</email>
</author>
<published>2019-01-27T10:17:15+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=49113d360bdeb4dd916fb6bffbcc3e157422b6fd'/>
<id>urn:sha1:49113d360bdeb4dd916fb6bffbcc3e157422b6fd</id>
<content type='text'>
The hard-coded value 10000 in grow_halt_poll_ns() stands for the initial
start value when raising up vcpu-&gt;halt_poll_ns.
It actually sets the first timeout to the first polling session.
This value has significant effect on how tolerant we are to outliers.
On the standard case, higher value is better - we will spend more time
in the polling busyloop, handle events/interrupts faster and result
in better performance.
But on outliers it puts us in a busy loop that does nothing.
Even if the shrink factor is zero, we will still waste time on the first
iteration.
The optimal value changes between different workloads. It depends on
outliers rate and polling sessions length.
As this value has significant effect on the dynamic halt-polling
algorithm, it should be configurable and exposed.

Reviewed-by: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Reviewed-by: Liran Alon &lt;liran.alon@oracle.com&gt;
Signed-off-by: Nir Weiner &lt;nir.weiner@oracle.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>Revert "KVM: MMU: document fast invalidate all pages"</title>
<updated>2019-02-20T21:48:39+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-02-05T21:01:22+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=a592a3b8fc62af25a6e76aebde97a5d5f6f13e0f'/>
<id>urn:sha1:a592a3b8fc62af25a6e76aebde97a5d5f6f13e0f</id>
<content type='text'>
Remove x86 KVM's fast invalidate mechanism, i.e. revert all patches
from the original series[1].

Though not explicitly stated, for all intents and purposes the fast
invalidate mechanism was added to speed up the scenario where removing
a memslot, e.g. as part of accessing reading PCI ROM, caused KVM to
flush all shadow entries[1].  Now that the memslot case flushes only
shadow entries belonging to the memslot, i.e. doesn't use the fast
invalidate mechanism, the only remaining usage of the mechanism are
when the VM is being destroyed and when the MMIO generation rolls
over.

When a VM is being destroyed, either there are no active vcpus, i.e.
there's no lock contention, or the VM has ungracefully terminated, in
which case we want to reclaim its pages as quickly as possible, i.e.
not release the MMU lock if there are still CPUs executing in the VM.

The MMIO generation scenario is almost literally a one-in-a-million
occurrence, i.e. is not a performance sensitive scenario.

Given that lock-breaking is not desirable (VM teardown) or irrelevant
(MMIO generation overflow), remove the fast invalidate mechanism to
simplify the code (a small amount) and to discourage future code from
zapping all pages as using such a big hammer should be a last restort.

This reverts commit f6f8adeef542a18b1cb26a0b772c9781a10bb477.

[1] https://lkml.kernel.org/r/1369960590-14138-1-git-send-email-xiaoguangrong@linux.vnet.ibm.com

Cc: Xiao Guangrong &lt;guangrong.xiao@gmail.com&gt;
Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>KVM: Move the memslot update in-progress flag to bit 63</title>
<updated>2019-02-20T21:48:37+00:00</updated>
<author>
<name>Sean Christopherson</name>
<email>sean.j.christopherson@intel.com</email>
</author>
<published>2019-02-05T21:01:18+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=164bf7e56c5a73f2f819c39ba7e0f20e0f97dc7b'/>
<id>urn:sha1:164bf7e56c5a73f2f819c39ba7e0f20e0f97dc7b</id>
<content type='text'>
...now that KVM won't explode by moving it out of bit 0.  Using bit 63
eliminates the need to jump over bit 0, e.g. when calculating a new
memslots generation or when propagating the memslots generation to an
MMIO spte.

Signed-off-by: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>Documentation/virtual/kvm: Update URL for AMD SEV API specification</title>
<updated>2019-01-11T17:38:07+00:00</updated>
<author>
<name>Christophe de Dinechin</name>
<email>dinechin@redhat.com</email>
</author>
<published>2019-01-07T17:52:38+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=cf1754c2a1d6e92688f7353aa7f598f5ad6d8f78'/>
<id>urn:sha1:cf1754c2a1d6e92688f7353aa7f598f5ad6d8f78</id>
<content type='text'>
The URL of [api-spec] in Documentation/virtual/kvm/amd-memory-encryption.rst
is no longer valid, replaced space with underscore.

Signed-off-by: Christophe de Dinechin &lt;dinechin@redhat.com&gt;
Reviewed-by: Brijesh Singh &lt;brijesh.singh@amd.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
</content>
</entry>
<entry>
<title>x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID</title>
<updated>2018-12-14T16:59:54+00:00</updated>
<author>
<name>Vitaly Kuznetsov</name>
<email>vkuznets@redhat.com</email>
</author>
<published>2018-12-10T17:21:56+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2bc39970e9327ceb06cb210f86ba35f81d00e350'/>
<id>urn:sha1:2bc39970e9327ceb06cb210f86ba35f81d00e350</id>
<content type='text'>
With every new Hyper-V Enlightenment we implement we're forced to add a
KVM_CAP_HYPERV_* capability. While this approach works it is fairly
inconvenient: the majority of the enlightenments we do have corresponding
CPUID feature bit(s) and userspace has to know this anyways to be able to
expose the feature to the guest.

Add KVM_GET_SUPPORTED_HV_CPUID ioctl (backed by KVM_CAP_HYPERV_CPUID, "one
cap to rule them all!") returning all Hyper-V CPUID feature leaves.

Using the existing KVM_GET_SUPPORTED_CPUID doesn't seem to be possible:
Hyper-V CPUID feature leaves intersect with KVM's (e.g. 0x40000000,
0x40000001) and we would probably confuse userspace in case we decide to
return these twice.

KVM_CAP_HYPERV_CPUID's number is interim: we're intended to drop
KVM_CAP_HYPERV_STIMER_DIRECT and use its number instead.

Suggested-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: introduce manual dirty log reprotect</title>
<updated>2018-12-14T11:34:19+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2018-10-23T00:36:47+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=2a31b9db153530df4aa02dac8c32837bf5f47019'/>
<id>urn:sha1:2a31b9db153530df4aa02dac8c32837bf5f47019</id>
<content type='text'>
There are two problems with KVM_GET_DIRTY_LOG.  First, and less important,
it can take kvm-&gt;mmu_lock for an extended period of time.  Second, its user
can actually see many false positives in some cases.  The latter is due
to a benign race like this:

  1. KVM_GET_DIRTY_LOG returns a set of dirty pages and write protects
     them.
  2. The guest modifies the pages, causing them to be marked ditry.
  3. Userspace actually copies the pages.
  4. KVM_GET_DIRTY_LOG returns those pages as dirty again, even though
     they were not written to since (3).

This is especially a problem for large guests, where the time between
(1) and (3) can be substantial.  This patch introduces a new
capability which, when enabled, makes KVM_GET_DIRTY_LOG not
write-protect the pages it returns.  Instead, userspace has to
explicitly clear the dirty log bits just before using the content
of the page.  The new KVM_CLEAR_DIRTY_LOG ioctl can also operate on a
64-page granularity rather than requiring to sync a full memslot;
this way, the mmu_lock is taken for small amounts of time, and
only a small amount of time will pass between write protection
of pages and the sending of their content.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic</title>
<updated>2018-12-14T11:34:18+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2017-02-16T09:40:56+00:00</published>
<link rel='alternate' type='text/html' href='http://mirrors.hust.edu.cn/git/lwn.git/commit/?id=e5d83c74a5800c2a1fa3ba982c1c4b2b39ae6db2'/>
<id>urn:sha1:e5d83c74a5800c2a1fa3ba982c1c4b2b39ae6db2</id>
<content type='text'>
The first such capability to be handled in virt/kvm/ will be manual
dirty page reprotection.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
</feed>
