lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2021-10-27	riscv: defconfig: enable DRM_NOUVEAU	Heinrich Schuchardt
	Both RADEON and NOUVEAU graphics cards are supported on RISC-V. Enabling the one and not the other does not make sense. As typically at most one of RADEON, NOUVEAU, or VIRTIO GPU support will be needed DRM drivers should be compiled as modules. Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-26	riscv/vdso: Drop unneeded part due to merge issue	Kefeng Wang
	It seems that something is wrong when patch "riscv/vdso: Refactor asm/vdso.h" is merged. Let's fix the merge issue. Fixes: 8edab02386c3 ("Merge remote-tracking branch 'palmer/riscv-vdso-cleanup' into for-next") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-26	riscv: remove .text section size limitation for XIP	Vitaly Wool
	Currently there's a limit of 8MB for the .text section of a RISC-V image in the XIP case. This breaks compilation of many automatic builds and is generally inconvenient. This patch removes that limitation and optimizes XIP image file size at the same time. Signed-off-by: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-21	Merge tag 'riscv-sifive-dt-5.16' of ↵	Palmer Dabbelt
	git://gitolite.kernel.org/pub/scm/linux/kernel/git/krzk/linux into for-next RISC-V DTS changes for v5.16 Cleanups of RISC-V SiFive and Microchip DTSes with dtschema. These are few minor fixes to make DTSes pass the dtschema, without actual functional effect. * tag 'riscv-sifive-dt-5.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krzk/linux: riscv: dts: sifive: add missing compatible for plic riscv: dts: microchip: add missing compatibles for clint and plic riscv: dts: sifive: drop duplicated nodes and properties in sifive riscv: dts: sifive: fix Unleashed board compatible riscv: dts: sifive: use only generic JEDEC SPI NOR flash compatible
2021-10-19	riscv: dts: sifive: add missing compatible for plic	Krzysztof Kozlowski
	Add proper compatible for Platform-Level Interrupt Controller to silence dtbs_check warnings: interrupt-controller@c000000: compatible: ['sifive,plic-1.0.0'] is too short Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Tested-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Link: https://lore.kernel.org/r/20210920130412.145231-2-krzysztof.kozlowski@canonical.com
2021-10-19	riscv: dts: microchip: add missing compatibles for clint and plic	Krzysztof Kozlowski
	The Microchip Icicle kit uses SiFive E51 and U54 cores, so it looks that also Core Local Interruptor and Platform-Level Interrupt Controller are coming from SiFive. Add proper compatibles to silence dtbs_check warnings: clint@2000000: compatible:0: 'sifive,clint0' is not one of ['sifive,fu540-c000-clint', 'canaan,k210-clint'] interrupt-controller@c000000: compatible:0: 'sifive,plic-1.0.0' is not one of ['sifive,fu540-c000-plic', 'canaan,k210-plic'] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20210920130412.145231-1-krzysztof.kozlowski@canonical.com
2021-10-19	riscv: dts: sifive: drop duplicated nodes and properties in sifive	Krzysztof Kozlowski
	The DTSI file defines soc node and address/size cells, so there is no point in duplicating it in DTS file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Tested-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Link: https://lore.kernel.org/r/20210920130248.145058-3-krzysztof.kozlowski@canonical.com
2021-10-19	riscv: dts: sifive: fix Unleashed board compatible	Krzysztof Kozlowski
	Add missing sifive,fu540 compatible to fix dtbs_check warnings: arch/riscv/boot/dts/sifive/hifive-unleashed-a00.dt.yaml: /: compatible: 'oneOf' conditional failed, one must be fixed: ['sifive,hifive-unleashed-a00', 'sifive,fu540-c000'] is too short 'sifive,hifive-unleashed-a00' is not one of ['sifive,hifive-unmatched-a00'] 'sifive,fu740-c000' was expected Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Tested-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Link: https://lore.kernel.org/r/20210920130248.145058-2-krzysztof.kozlowski@canonical.com
2021-10-19	riscv: dts: sifive: use only generic JEDEC SPI NOR flash compatible	Krzysztof Kozlowski
	The compatible "issi,is25wp256" is undocumented and instead only a generic jedec,spi-nor should be used (if appropriate). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20210920130248.145058-1-krzysztof.kozlowski@canonical.com
2021-10-07	riscv: dts: microchip: use vendor compatible for Cadence SD4HC	Krzysztof Kozlowski
	Licensed IP blocks should have their own vendor compatible. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-07	riscv: dts: microchip: drop unused pinctrl-names	Krzysztof Kozlowski
	pinctrl-names without pinctrl-0 does not have any sense: arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dt.yaml: sdhc@20008000: 'pinctrl-0' is a dependency of 'pinctrl-names' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-07	riscv: dts: microchip: drop duplicated MMC/SDHC node	Krzysztof Kozlowski
	Devicetree source is a description of hardware and hardware has only one block @20008000 which can be configured either as eMMC or SDHC. Having two node for different modes is an obscure, unusual and confusing way to configure it. Instead the board file is supposed to customize the block to its needs, e.g. to SDHC mode. This fixes dtbs_check warning: arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dt.yaml: sdhc@20008000: $nodename:0: 'sdhc@20008000' does not match '^mmc(@.*)?$' Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-07	riscv: dts: microchip: fix board compatible	Krzysztof Kozlowski
	According to bindings, the compatible must include microchip,mpfs. This fixes dtbs_check warning: arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dt.yaml: /: compatible: ['microchip,mpfs-icicle-kit'] is too short Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-07	riscv: dts: microchip: drop duplicated nodes	Krzysztof Kozlowski
	The DTSI file defines soc node and address/size cells, so there is no point in duplicating it in DTS file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-07	dt-bindings: mmc: cdns: document Microchip MPFS MMC/SDHCI controller	Krzysztof Kozlowski
	The Microchip PolarFire SoC FPGA DTSI uses Cadence SD/SDIO/eMMC Host Controller without any additional vendor compatible: arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dt.yaml: mmc@20008000: compatible:0: 'cdns,sd4hc' is not one of ['socionext,uniphier-sd4hc'] arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dt.yaml: mmc@20008000: compatible: ['cdns,sd4hc'] is too short Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04	Merge tag 'for-riscv' of https://git.kernel.org/pub/scm/virt/kvm/kvm.git ↵	Palmer Dabbelt
	into for-next H extension definitions, shared by the KVM and RISC-V trees. * tag 'for-riscv' of ssh://gitolite.kernel.org/pub/scm/virt/kvm/kvm: (301 commits) RISC-V: Add hypervisor extension related CSR defines KVM: selftests: Ensure all migrations are performed when test is affined KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm x86/kvmclock: Move this_cpu_pvti into kvmclock.h KVM: s390: Function documentation fixes selftests: KVM: Don't clobber XMM register when read KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue selftests: KVM: Explicitly use movq to read xmm registers selftests: KVM: Call ucall_init when setting up in rseq_test KVM: Remove tlbs_dirty KVM: X86: Synchronize the shadow pagetable before link it KVM: X86: Fix missed remote tlb flush in rmap_write_protect() KVM: x86: nSVM: don't copy virt_ext from vmcb12 KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0 KVM: x86: nSVM: restore int_vector in svm_clear_vintr kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[] KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry ...
2021-10-04	riscv: add rv32 and rv64 randconfig build targets	Randy Dunlap
	Add the ability to do randconfig build targets for both rv32 and rv64. Based on a similar patch by Michael Ellerman for PowerPC. Usage: make ARCH=riscv rv32_randconfig or make ARCH=riscv rv64_randconfig Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04	riscv: mm: don't advertise 1 num_asid for 0 asid bits	Vineet Gupta
	Even if mmu doesn't support ASID, current code calculates @num_asids=1 which is misleading, so avoid setting any asid related variables in such case. Also while here, print the number of asid bits discovered even for the disabled case. Verified this on Hifive Unmatched. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Vineet Gupta <vgupta@kernel.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04	riscv: set default pm_power_off to NULL	Dimitri John Ledkov
	Set pm_power_off to NULL like on all other architectures, check if it is set in machine_halt() and machine_power_off() and fallback to default_power_off if no other power driver got registered. This brings riscv architecture inline with all other architectures, and allows to reuse exiting power drivers unmodified. Kernels without legacy SBI v0.1 extensions (CONFIG_RISCV_SBI_V01 is not set), do not set pm_power_off to sbi_shutdown(). There is no support for SBI v0.3 system reset extension either. This prevents using gpio_poweroff on SiFive HiFive Unmatched. Tested on SiFive HiFive unmatched, with a dtb specifying gpio-poweroff node and kernel complied without CONFIG_RISCV_SBI_V01. BugLink: https://bugs.launchpad.net/bugs/1942806 Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com> Reviewed-by: Anup Patel <anup@brainfault.org> Tested-by: Ron Economos <w6rz@comcast.net> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04	riscv/vdso: Add support for time namespaces	Tong Tiangen
	Implement generic vdso time namespace support which also enables time namespaces for riscv. This is quite similar to what arm64 does. selftest/timens test result: 1..10 ok 1 Passed for CLOCK_BOOTTIME (syscall) ok 2 Passed for CLOCK_BOOTTIME (vdso) ok 3 # SKIP CLOCK_BOOTTIME_ALARM isn't supported ok 4 # SKIP CLOCK_BOOTTIME_ALARM isn't supported ok 5 Passed for CLOCK_MONOTONIC (syscall) ok 6 Passed for CLOCK_MONOTONIC (vdso) ok 7 Passed for CLOCK_MONOTONIC_COARSE (syscall) ok 8 Passed for CLOCK_MONOTONIC_COARSE (vdso) ok 9 Passed for CLOCK_MONOTONIC_RAW (syscall) ok 10 Passed for CLOCK_MONOTONIC_RAW (vdso) # Totals: pass:8 fail:0 xfail:0 xpass:0 skip:2 error:0 Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-04	RISC-V: Add hypervisor extension related CSR defines	Anup Patel
	This patch adds asm/kvm_csr.h for RISC-V hypervisor extension related defines. Signed-off-by: Anup Patel <anup.patel@wdc.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alexander Graf <graf@amazon.com> Message-Id: <20210927114016.1089328-2-anup.patel@wdc.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-04	Merge tag 'kvm-s390-master-5.15-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master KVM: s390: allow to compile without warning with W=1
2021-10-02	Merge remote-tracking branch 'palmer/riscv-vdso-cleanup' into for-next	Palmer Dabbelt
	This contains a VDSO cleanup, along with a handful of VDSO fixes. These serve as a base for the new time namespace support, but are also on fixes. * palmer/riscv-vdso-cleanup: riscv/vdso: make arch_setup_additional_pages wait for mmap_sem for write killable riscv/vdso: Move vdso data page up front riscv/vdso: Refactor asm/vdso.h
2021-10-02	riscv/vdso: make arch_setup_additional_pages wait for mmap_sem for write ↵	Tong Tiangen
	killable riscv architectures relying on mmap_sem for write in their arch_setup_additional_pages. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-02	riscv/vdso: Move vdso data page up front	Tong Tiangen
	As commit 601255ae3c98 ("arm64: vdso: move data page before code pages"), the same issue exists on riscv, testcase is shown below, make sure that vdso.so is bigger than page size, struct timespec tp; clock_gettime(5, &tp); printf("tv_sec: %ld, tv_nsec: %ld\n", tp.tv_sec, tp.tv_nsec); without this patch, test result : tv_sec: 0, tv_nsec: 0 with this patch, test result : tv_sec: 1629271537, tv_nsec: 748000000 Move the vdso data page in front of the VDSO area to fix the issue. Fixes: ad5d1122b82fb ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-10-02	riscv/vdso: Refactor asm/vdso.h	Tong Tiangen
	The asm/vdso.h will be included in vdso.lds.S in the next patch, the following cleanup is needed to avoid syntax error: 1.the declaration of sys_riscv_flush_icache() is moved into asm/syscall.h. 2.the definition of struct vdso_data is moved into kernel/vdso.c. 2.the definition of VDSO_SYMBOL is placed under "#ifndef __ASSEMBLY__". Also remove the redundant linux/types.h include. Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-09-30	KVM: selftests: Ensure all migrations are performed when test is affined	Sean Christopherson
	Rework the CPU selection in the migration worker to ensure the specified number of migrations are performed when the test iteslf is affined to a subset of CPUs. The existing logic skips iterations if the target CPU is not in the original set of possible CPUs, which causes the test to fail if too many iterations are skipped. ==== Test Assertion Failure ==== rseq_test.c:228: i > (NR_TASK_MIGRATIONS / 2) pid=10127 tid=10127 errno=4 - Interrupted system call 1 0x00000000004018e5: main at rseq_test.c:227 2 0x00007fcc8fc66bf6: ?? ??:0 3 0x0000000000401959: _start at ??:? Only performed 4 KVM_RUNs, task stalled too much? Calculate the min/max possible CPUs as a cheap "best effort" to avoid high runtimes when the test is affined to a small percentage of CPUs. Alternatively, a list or xarray of the possible CPUs could be used, but even in a horrendously inefficient setup, such optimizations are not needed because the runtime is completely dominated by the cost of migrating the task, and the absolute runtime is well under a minute in even truly absurd setups, e.g. running on a subset of vCPUs in a VM that is heavily overcommited (16 vCPUs per pCPU). Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs") Reported-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210929234112.1862848-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30	KVM: x86: Swap order of CPUID entry "index" vs. "significant flag" checks	Sean Christopherson
	Check whether a CPUID entry's index is significant before checking for a matching index to hack-a-fix an undefined behavior bug due to consuming uninitialized data. RESET/INIT emulation uses kvm_cpuid() to retrieve CPUID.0x1, which does _not_ have a significant index, and fails to initialize the dummy variable that doubles as EBX/ECX/EDX output _and_ ECX, a.k.a. index, input. Practically speaking, it's _extremely_ unlikely any compiler will yield code that causes problems, as the compiler would need to inline the kvm_cpuid() call to detect the uninitialized data, and intentionally hose the kernel, e.g. insert ud2, instead of simply ignoring the result of the index comparison. Although the sketchy "dummy" pattern was introduced in SVM by commit 66f7b72e1171 ("KVM: x86: Make register state after reset conform to specification"), it wasn't actually broken until commit 7ff6c0350315 ("KVM: x86: Remove stateful CPUID handling") arbitrarily swapped the order of operations such that "index" was checked before the significant flag. Avoid consuming uninitialized data by reverting to checking the flag before the index purely so that the fix can be easily backported; the offending RESET/INIT code has been refactored, moved, and consolidated from vendor code to common x86 since the bug was introduced. A future patch will directly address the bad RESET/INIT behavior. The undefined behavior was detected by syzbot + KernelMemorySanitizer. BUG: KMSAN: uninit-value in cpuid_entry2_find arch/x86/kvm/cpuid.c:68 BUG: KMSAN: uninit-value in kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103 BUG: KMSAN: uninit-value in kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183 cpuid_entry2_find arch/x86/kvm/cpuid.c:68 [inline] kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103 [inline] kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183 kvm_vcpu_reset+0x13fb/0x1c20 arch/x86/kvm/x86.c:10885 kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923 vcpu_enter_guest+0xfd2/0x6d80 arch/x86/kvm/x86.c:9534 vcpu_run+0x7f5/0x18d0 arch/x86/kvm/x86.c:9788 kvm_arch_vcpu_ioctl_run+0x245b/0x2d10 arch/x86/kvm/x86.c:10020 Local variable ----dummy@kvm_vcpu_reset created at: kvm_vcpu_reset+0x1fb/0x1c20 arch/x86/kvm/x86.c:10812 kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923 Reported-by: syzbot+f3985126b746b3d59c9d@syzkaller.appspotmail.com Reported-by: Alexander Potapenko <glider@google.com> Fixes: 2a24be79b6b7 ("KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping") Fixes: 7ff6c0350315 ("KVM: x86: Remove stateful CPUID handling") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20210929222426.1855730-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30	ptp: Fix ptp_kvm_getcrosststamp issue for x86 ptp_kvm	Zelin Deng
	hv_clock is preallocated to have only HVC_BOOT_ARRAY_SIZE (64) elements; if the PTP_SYS_OFFSET_PRECISE ioctl is executed on vCPUs whose index is 64 of higher, retrieving the struct pvclock_vcpu_time_info pointer with "src = &hv_clock[cpu].pvti" will result in an out-of-bounds access and a wild pointer. Change it to "this_cpu_pvti()" which is guaranteed to be valid. Fixes: 95a3d4454bb1 ("Switch kvmclock data to a PER_CPU variable") Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com> Cc: <stable@vger.kernel.org> Message-Id: <1632892429-101194-3-git-send-email-zelin.deng@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30	x86/kvmclock: Move this_cpu_pvti into kvmclock.h	Zelin Deng
	There're other modules might use hv_clock_per_cpu variable like ptp_kvm, so move it into kvmclock.h and export the symbol to make it visiable to other modules. Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com> Cc: <stable@vger.kernel.org> Message-Id: <1632892429-101194-2-git-send-email-zelin.deng@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-28	KVM: s390: Function documentation fixes	Janosch Frank
	The latest compile changes pointed us to a few instances where we use the kernel documentation style but don't explain all variables or don't adhere to it 100%. It's easy to fix so let's do that. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-09-28	selftests: KVM: Don't clobber XMM register when read	Oliver Upton
	There is no need to clobber a register that is only being read from. Oops. Drop the XMM register from the clobbers list. Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20210927223621.50178-1-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-27	KVM: VMX: Fix a TSX_CTRL_CPUID_CLEAR field mask issue	Zhenzhong Duan
	When updating the host's mask for its MSR_IA32_TSX_CTRL user return entry, clear the mask in the found uret MSR instead of vmx->guest_uret_msrs[i]. Modifying guest_uret_msrs directly is completely broken as 'i' does not point at the MSR_IA32_TSX_CTRL entry. In fact, it's guaranteed to be an out-of-bounds accesses as is always set to kvm_nr_uret_msrs in a prior loop. By sheer dumb luck, the fallout is limited to "only" failing to preserve the host's TSX_CTRL_CPUID_CLEAR. The out-of-bounds access is benign as it's guaranteed to clear a bit in a guest MSR value, which are always zero at vCPU creation on both x86-64 and i386. Cc: stable@vger.kernel.org Fixes: 8ea8b8d6f869 ("KVM: VMX: Use common x86's uret MSR list as the one true list") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210926015545.281083-1-zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24	Merge tag 'kvmarm-fixes-5.15-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/arm64 fixes for 5.15, take #1 - Add missing FORCE target when building the EL2 object - Fix a PMU probe regression on some platforms
2021-09-24	selftests: KVM: Explicitly use movq to read xmm registers	Oliver Upton
	Compiling the KVM selftests with clang emits the following warning: >> include/x86_64/processor.h:297:25: error: variable 'xmm0' is uninitialized when used here [-Werror,-Wuninitialized] >> return (unsigned long)xmm0; where xmm0 is accessed via an uninitialized register variable. Indeed, this is a misuse of register variables, which really should only be used for specifying register constraints on variables passed to inline assembly. Rather than attempting to read xmm registers via register variables, just explicitly perform the movq from the desired xmm register. Fixes: 783e9e51266e ("kvm: selftests: add API testing infrastructure") Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20210924005147.1122357-1-oupton@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-24	selftests: KVM: Call ucall_init when setting up in rseq_test	Oliver Upton
	While x86 does not require any additional setup to use the ucall infrastructure, arm64 needs to set up the MMIO address used to signal a ucall to userspace. rseq_test does not initialize the MMIO address, resulting in the test spinning indefinitely. Fix the issue by calling ucall_init() during setup. Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs") Signed-off-by: Oliver Upton <oupton@google.com> Message-Id: <20210923220033.4172362-1-oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: Remove tlbs_dirty	Lai Jiangshan
	There is no user of tlbs_dirty. Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210918005636.3675-4-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: X86: Synchronize the shadow pagetable before link it	Lai Jiangshan
	If gpte is changed from non-present to present, the guest doesn't need to flush tlb per SDM. So the host must synchronze sp before link it. Otherwise the guest might use a wrong mapping. For example: the guest first changes a level-1 pagetable, and then links its parent to a new place where the original gpte is non-present. Finally the guest can access the remapped area without flushing the tlb. The guest's behavior should be allowed per SDM, but the host kvm mmu makes it wrong. Fixes: 4731d4c7a077 ("KVM: MMU: out of sync shadow core") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210918005636.3675-3-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: X86: Fix missed remote tlb flush in rmap_write_protect()	Lai Jiangshan
	When kvm->tlbs_dirty > 0, some rmaps might have been deleted without flushing tlb remotely after kvm_sync_page(). If @gfn was writable before and it's rmaps was deleted in kvm_sync_page(), and if the tlb entry is still in a remote running VCPU, the @gfn is not safely protected. To fix the problem, kvm_sync_page() does the remote flush when needed to avoid the problem. Fixes: a4ee1ca4a36e ("KVM: MMU: delay flush all tlbs on sync_page path") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210918005636.3675-2-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: x86: nSVM: don't copy virt_ext from vmcb12	Maxim Levitsky
	These field correspond to features that we don't expose yet to L2 While currently there are no CVE worthy features in this field, if AMD adds more features to this field, that could allow guest escapes similar to CVE-2021-3653 and CVE-2021-3656. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-6-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround	Maxim Levitsky
	GP SVM errata workaround made the #GP handler always emulate the SVM instructions. However these instructions #GP in case the operand is not 4K aligned, but the workaround code didn't check this and we ended up emulating these instructions anyway. This is only an emulation accuracy check bug as there is no harm for KVM to read/write unaligned vmcb images. Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: x86: selftests: test simultaneous uses of V_IRQ from L1 and L0	Maxim Levitsky
	Test that if: * L1 disables virtual interrupt masking, and INTR intercept. * L1 setups a virtual interrupt to be injected to L2 and enters L2 with interrupts disabled, thus the virtual interrupt is pending. * Now an external interrupt arrives in L1 and since L1 doesn't intercept it, it should be delivered to L2 when it enables interrupts. to do this L0 (abuses) V_IRQ to setup an interrupt window, and returns to L2. * L2 enables interrupts. This should trigger the interrupt window, injection of the external interrupt and delivery of the virtual interrupt that can now be done. * Test that now L2 gets those interrupts. This is the test that demonstrates the issue that was fixed in the previous patch. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-23	KVM: x86: nSVM: restore int_vector in svm_clear_vintr	Maxim Levitsky
	In svm_clear_vintr we try to restore the virtual interrupt injection that might be pending, but we fail to restore the interrupt vector. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	kvm: x86: Add AMD PMU MSRs to msrs_to_save_all[]	Fares Mehanna
	Intel PMU MSRs is in msrs_to_save_all[], so add AMD PMU MSRs to have a consistent behavior between Intel and AMD when using KVM_GET_MSRS, KVM_SET_MSRS or KVM_GET_MSR_INDEX_LIST. We have to add legacy and new MSRs to handle guests running without X86_FEATURE_PERFCTR_CORE. Signed-off-by: Fares Mehanna <faresx@amazon.de> Message-Id: <20210915133951.22389-1-faresx@amazon.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	KVM: x86: nVMX: re-evaluate emulation_required on nested VM exit	Maxim Levitsky
	If L1 had invalid state on VM entry (can happen on SMM transactions when we enter from real mode, straight to nested guest), then after we load 'host' state from VMCS12, the state has to become valid again, but since we load the segment registers with __vmx_set_segment we weren't always updating emulation_required. Update emulation_required explicitly at end of load_vmcs12_host_state. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-8-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if ↵	Maxim Levitsky
	!from_vmentry It is possible that when non root mode is entered via special entry (!from_vmentry), that is from SMM or from loading the nested state, the L2 state could be invalid in regard to non unrestricted guest mode, but later it can become valid. (for example when RSM emulation restores segment registers from SMRAM) Thus delay the check to VM entry, where we will check this and fail. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-7-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	KVM: x86: VMX: synthesize invalid VM exit when emulating invalid guest state	Maxim Levitsky
	Since no actual VM entry happened, the VM exit information is stale. To avoid this, synthesize an invalid VM guest state VM exit. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-6-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smm	Maxim Levitsky
	Use return statements instead of nested if, and fix error path to free all the maps that were allocated. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM mode	Maxim Levitsky
	Currently the KVM_REQ_GET_NESTED_STATE_PAGES on SVM only reloads PDPTRs, and MSR bitmap, with former not really needed for SMM as SMM exit code reloads them again from SMRAM'S CR3, and later happens to work since MSR bitmap isn't modified while in SMM. Still it is better to be consistient with VMX. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22	KVM: x86: reset pdptrs_from_userspace when exiting smm	Maxim Levitsky
	When exiting SMM, pdpts are loaded again from the guest memory. This fixes a theoretical bug, when exit from SMM triggers entry to the nested guest which re-uses some of the migration code which uses this flag as a workaround for a legacy userspace. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>