path: root/arch/arm64/include
Age / Commit message / Author
2021-03-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "More fixes for ARM and x86" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: LAPIC: Advancing the timer expiration on guest initiated write KVM: x86/mmu: Skip !MMU-present SPTEs when removing SP in exclusive mode KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged kvm: x86: annotate RCU pointers KVM: arm64: Fix exclusive limit for IPA size KVM: arm64: Reject VM creation when the default IPA size is unsupported KVM: arm64: Ensure I-cache isolation between vcpus of a same VM KVM: arm64: Don't use cbz/adr with external symbols KVM: arm64: Fix range alignment when walking page tables KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config() KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is available KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key KVM: arm64: Fix nVHE hyp panic host context restore KVM: arm64: Avoid corrupting vCPU context register in guest exit KVM: arm64: nvhe: Save the SPE context early kvm: x86: use NULL instead of using plain integer as pointer KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled' KVM: x86: Ensure deadline timer has truly expired before posting its IRQ
2021-03-11arm64: mm: remove unused __cpu_uses_extended_idmap[_level()]Ard Biesheuvel
These routines lost all existing users during the latest merge window so we can remove them. This avoids the need to fix them in the context of fixing a regression related to the ID map on 52-bit VA kernels. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210310171515.416643-3-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-03-11arm64: mm: use a 48-bit ID map when possible on 52-bit VA buildsArd Biesheuvel
52-bit VA kernels can run on hardware that is only 48-bit capable, but configure the ID map as 52-bit by default. This was not a problem until recently, because the special T0SZ value for a 52-bit VA space was never programmed into the TCR register anyway, and because a 52-bit ID map happens to use the same number of translation levels as a 48-bit one. This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ value for a 52-bit VA to be programmed into TCR_EL1. While some hardware simply ignores this, Mark reports that Amberwing systems choke on this, resulting in a broken boot. But even before that commit, the unsupported idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly as well. Given that we already have to deal with address spaces being either 48-bit or 52-bit in size, the cleanest approach seems to be to simply default to a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the kernel in DRAM requires it. This is guaranteed not to happen unless the system is actually 52-bit VA capable. Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN") Reported-by: Mark Salter <msalter@redhat.com> Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
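The arithmetic behind this change can be sketched as follows. On arm64, the T0SZ field of TCR_ELx encodes the size of the lower address range as 64 minus the number of VA bits, so a 48-bit ID map corresponds to T0SZ = 16 and a 52-bit one to T0SZ = 12. The helper names `t0sz_for` and `idmap_va_bits` are hypothetical and only illustrate the policy described in the commit message; they are not the kernel's implementation.

```python
def t0sz_for(va_bits: int) -> int:
    """Return the T0SZ value for a VA space of `va_bits` bits.

    The region covered is 2**(64 - T0SZ) bytes, hence T0SZ = 64 - va_bits.
    """
    return 64 - va_bits


def idmap_va_bits(highest_idmap_addr: int) -> int:
    """Pick the ID map size (hypothetical helper, illustration only).

    The ID map is an identity mapping, so it must cover the physical
    address of the code it maps. Default to 48 bits and widen to 52 bits
    only when that address does not fit in 48 bits.
    """
    if highest_idmap_addr >= (1 << 48):
        return 52
    return 48


if __name__ == "__main__":
    for addr in (0x8000_0000, 1 << 50):
        bits = idmap_va_bits(addr)
        print(f"addr={addr:#x} va_bits={bits} t0sz={t0sz_for(bits)}")
```

In this sketch a kernel placed below the 48-bit boundary always gets the T0SZ value that 48-bit-only hardware supports, which matches the default the commit message argues for.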
2021-03-10arm64/mm: Fix __enable_mmu() for new TGRAN range valuesJames Morse
As per ARM ARM DDI 0487G.a, when FEAT_LPA2 is implemented, ID_AA64MMFR0_EL1 might contain a range of values to describe supported translation granules (4K and 16K pages sizes in particular) instead of just enabled or disabled values. This changes __enable_mmu() function to handle complete acceptable range of values (depending on whether the field is signed or unsigned) now represented with ID_AA64MMFR0_TGRAN_SUPPORTED_[MIN..MAX] pair. While here, also fix similar situations in EFI stub and KVM as well. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1615355590-21102-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
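The move from an "enabled or disabled" comparison to a range check can be sketched as below. The function names are hypothetical. The field positions and the example ranges passed in the usage section (TGRAN4 at bits [31:28] with an accepted range of 0x0..0x7, TGRAN16 at bits [23:20] with 0x1..0xF) are assumptions chosen for illustration and should be checked against the architecture manual and the kernel headers.

```python
def id_field(reg: int, shift: int, width: int = 4) -> int:
    """Extract an unsigned `width`-bit field starting at bit `shift`."""
    return (reg >> shift) & ((1 << width) - 1)


def granule_supported(reg: int, shift: int, lo: int, hi: int) -> bool:
    """Accept any field value inside [lo, hi] instead of one exact value."""
    return lo <= id_field(reg, shift) <= hi


if __name__ == "__main__":
    # Assumed layout, for illustration only.
    TGRAN4_SHIFT, TGRAN4_MIN, TGRAN4_MAX = 28, 0x0, 0x7
    TGRAN16_SHIFT, TGRAN16_MIN, TGRAN16_MAX = 20, 0x1, 0xF

    reg = (0x1 << TGRAN4_SHIFT) | (0x2 << TGRAN16_SHIFT)
    print(granule_supported(reg, TGRAN4_SHIFT, TGRAN4_MIN, TGRAN4_MAX))
    print(granule_supported(reg, TGRAN16_SHIFT, TGRAN16_MIN, TGRAN16_MAX))
```

An equality test against a single "supported" constant would reject the newer encodings; the range check accepts them while still rejecting the "not implemented" value that lies outside the range.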
2021-03-10arm64: mte: Map hotplugged memory as Normal TaggedCatalin Marinas
In a system supporting MTE, the linear map must allow reading/writing allocation tags by setting the memory type as Normal Tagged. Currently, this is only handled for memory present at boot. Hotplugged memory uses Normal non-Tagged memory. Introduce pgprot_mhp() for hotplugged memory and use it in add_memory_resource(). The arm64 code maps pgprot_mhp() to pgprot_tagged(). Note that ZONE_DEVICE memory should not be mapped as Tagged and therefore setting the memory type in arch_add_memory() is not feasible. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Fixes: 0178dc761368 ("arm64: mte: Use Normal Tagged attributes for the linear map") Reported-by: Patrick Daly <pdaly@codeaurora.org> Tested-by: Patrick Daly <pdaly@codeaurora.org> Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora.org Cc: <stable@vger.kernel.org> # 5.10.x Cc: Will Deacon <will@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-09KVM: arm64: Ensure I-cache isolation between vcpus of a same VMMarc Zyngier
It recently became apparent that the ARMv8 architecture has interesting rules regarding attributes being used when fetching instructions if the MMU is off at Stage-1. In this situation, the CPU is allowed to fetch from the PoC and allocate into the I-cache (unless the memory is mapped with the XN attribute at Stage-2). If we transpose this to vcpus sharing a single physical CPU, it is possible for a vcpu running with its MMU off to influence another vcpu running with its MMU on, as the latter is expected to fetch from the PoU (and self-patching code doesn't flush below that level). In order to solve this, reuse the vcpu-private TLB invalidation code to apply the same policy to the I-cache, nuking it every time the vcpu runs on a physical CPU that ran another vcpu of the same VM in the past. This involves renaming __kvm_tlb_flush_local_vmid() to __kvm_flush_cpu_context(), and inserting a local i-cache invalidation there. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210303164505.68492-1-maz@kernel.org
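The policy described in this message, flushing whenever a vcpu lands on a physical CPU that last ran a different vcpu of the same VM, can be modelled with a small sketch. The class and method names are hypothetical, and `flush_cpu_context` merely counts calls where the real code would invalidate the local TLB entries and the I-cache. It follows the wording of the commit message and is a simplification, not the kernel's bookkeeping.

```python
class PhysicalCpu:
    """Toy model of one physical CPU tracking which vcpu it last ran per VM."""

    def __init__(self) -> None:
        self.last_vcpu_ran: dict[int, int] = {}  # vm_id -> vcpu_id
        self.flushes = 0

    def flush_cpu_context(self) -> None:
        # Stand-in for local TLB plus I-cache invalidation.
        self.flushes += 1

    def load_vcpu(self, vm_id: int, vcpu_id: int) -> None:
        last = self.last_vcpu_ran.get(vm_id)
        # Flush only if another vcpu of the same VM ran here before.
        if last is not None and last != vcpu_id:
            self.flush_cpu_context()
        self.last_vcpu_ran[vm_id] = vcpu_id


if __name__ == "__main__":
    cpu = PhysicalCpu()
    for vm, vcpu in [(1, 0), (1, 0), (1, 1), (2, 0), (1, 0)]:
        cpu.load_vcpu(vm, vcpu)
    print("flushes:", cpu.flushes)
```

Keying the record by VM means vcpus of unrelated VMs never trigger each other's flushes in this model; only a change of vcpu within one VM does.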
2021-03-09arm64: kasan: fix page_alloc tagging with DEBUG_VIRTUALAndrey Konovalov
When CONFIG_DEBUG_VIRTUAL is enabled, the default page_to_virt() macro implementation from include/linux/mm.h is used. That definition doesn't account for KASAN tags, which leads to no tags on page_alloc allocations. Provide an arm64-specific definition for page_to_virt() when CONFIG_DEBUG_VIRTUAL is enabled that takes care of KASAN tags. Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Cc: <stable@vger.kernel.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/4b55b35202706223d3118230701c6a59749d9b72.1615219501.git.andreyknvl@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-06KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config()Marc Zyngier
As we are about to report a bit more information to the rest of the kernel, rename __vgic_v3_get_ich_vtr_el2() to the more explicit __vgic_v3_get_gic_config(). No functional change. Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Message-Id: <20210305185254.3730990-7-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-06KVM: arm64: Fix nVHE hyp panic host context restoreAndrew Scull
When panicking from the nVHE hyp and restoring the host context, x29 is expected to hold a pointer to the host context. This wasn't being done so fix it to make sure there's a valid pointer to the host context being used. Rather than passing a boolean indicating whether or not the host context should be restored, instead pass the pointer to the host context. NULL is passed to indicate that no context should be restored. Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics") Cc: stable@vger.kernel.org Signed-off-by: Andrew Scull <ascull@google.com> [maz: partial rewrite to fit 5.12-rc1] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210219122406.1337626-1-ascull@google.com Message-Id: <20210305185254.3730990-4-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-06KVM: arm64: nvhe: Save the SPE context earlySuzuki K Poulose
The nVHE KVM hyp drains and disables the SPE buffer, before entering the guest, as the EL1&0 translation regime is going to be loaded with that of the guest. But this operation is performed way too late, because : - The owning translation regime of the SPE buffer is transferred to EL2. (MDCR_EL2_E2PB == 0) - The guest Stage1 is loaded. Thus the flush could use the host EL1 virtual address, but use the EL2 translations instead of host EL1, for writing out any cached data. Fix this by moving the SPE buffer handling early enough. The restore path is doing the right thing. Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow") Cc: stable@vger.kernel.org Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210302120345.3102874-1-suzuki.poulose@arm.com Message-Id: <20210305185254.3730990-2-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-26Merge tag 'riscv-for-linus-5.12-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of new RISC-V related patches for this merge window: - A check to ensure drivers are properly using uaccess. This isn't manifesting with any of the drivers I'm currently using, but may catch errors in new drivers. - Some preliminary support for the FU740, along with the HiFive Unleashed it will appear on. - NUMA support for RISC-V, which involves making the arm64 code generic. - Support for kasan on the vmalloc region. - A handful of new drivers for the Kendryte K210, along with the DT plumbing required to boot on a handful of K210-based boards. - Support for allocating ASIDs. - Preliminary support for kernels larger than 128MiB. - Various other improvements to our KASAN support, including the utilization of huge pages when allocating the KASAN regions. We may have already found a bug with the KASAN_VMALLOC code, but it's passing my tests. There's a fix in the works, but that will probably miss the merge window. 
* tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits) riscv: Improve kasan population by using hugepages when possible riscv: Improve kasan population function riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization riscv: Improve kasan definitions riscv: Get rid of MAX_EARLY_MAPPING_SIZE soc: canaan: Sort the Makefile alphabetically riscv: Disable KSAN_SANITIZE for vDSO riscv: Remove unnecessary declaration riscv: Add Canaan Kendryte K210 SD card defconfig riscv: Update Canaan Kendryte K210 defconfig riscv: Add Kendryte KD233 board device tree riscv: Add SiPeed MAIXDUINO board device tree riscv: Add SiPeed MAIX GO board device tree riscv: Add SiPeed MAIX DOCK board device tree riscv: Add SiPeed MAIX BiT board device tree riscv: Update Canaan Kendryte K210 device tree dt-bindings: add resets property to dw-apb-timer dt-bindings: fix sifive gpio properties dt-bindings: update sifive uart compatible string dt-bindings: update sifive clint compatible string ...
2021-02-26Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The big one is a fix for the VHE enabling path during early boot, where the code enabling the MMU wasn't necessarily in the identity map of the new page-tables, resulting in a consistent crash with 64k pages. In fixing that, we noticed some missing barriers too, so we added those for the sake of architectural compliance. Other than that, just the usual merge window trickle. There'll be more to come, too. Summary: - Fix lockdep false alarm on resume-from-cpuidle path - Fix memory leak in kexec_file - Fix module linker script to work with GDB - Fix error code when trying to use uprobes with AArch32 instructions - Fix late VHE enabling with 64k pages - Add missing ISBs after TLB invalidation - Fix seccomp when tracing syscall -1 - Fix stacktrace return code at end of stack - Fix inconsistent whitespace for pointer return values - Fix compiler warnings when building with W=1" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: stacktrace: Report when we reach the end of the stack arm64: ptrace: Fix seccomp of traced syscall -1 (NO_SYSCALL) arm64: Add missing ISB after invalidating TLB in enter_vhe arm64: Add missing ISB after invalidating TLB in __primary_switch arm64: VHE: Enable EL2 MMU from the idmap KVM: arm64: make the hyp vector table entries local arm64/mm: Fixed some coding style issues arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing kexec: move machine_kexec_post_load() to public interface arm64 module: set plt* section addresses to 0x0 arm64: kexec_file: fix memory leakage in create_dtb() when fdt_open_into() fails arm64: spectre: Prevent lockdep splat on v4 mitigation enable path
2021-02-26arm64: kasan: simplify and inline MTE functionsAndrey Konovalov
This change provides a simpler implementation of mte_get_mem_tag(), mte_get_random_tag(), and mte_set_mem_tag_range(). Simplifications include removing system_supports_mte() checks as these functions are only called from KASAN runtime that had already checked system_supports_mte(). Besides that, size and address alignment checks are removed from mte_set_mem_tag_range(), as KASAN now does those. This change also moves these functions into the asm/mte-kasan.h header and implements mte_set_mem_tag_range() via inline assembly to avoid unnecessary function calls. [vincenzo.frascino@arm.com: fix warning in mte_get_random_tag()] Link: https://lkml.kernel.org/r/20210211152208.23811-1-vincenzo.frascino@arm.com Link: https://lkml.kernel.org/r/a26121b294fdf76e369cb7a74351d1c03a908930.1612546384.git.andreyknvl@google.com Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Marco Elver <elver@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26kfence: use pt_regs to generate stack trace on faultsMarco Elver
Instead of removing the fault handling portion of the stack trace based on the fault handler's name, just use struct pt_regs directly. Change kfence_handle_page_fault() to take a struct pt_regs, and plumb it through to kfence_report_error() for out-of-bounds, use-after-free, or invalid access errors, where pt_regs is used to generate the stack trace. If the kernel is a DEBUG_KERNEL, also show registers for more information. Link: https://lkml.kernel.org/r/20201105092133.2075331-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26arm64, kfence: enable KFENCE for ARM64Marco Elver
Add architecture specific implementation details for KFENCE and enable KFENCE for the arm64 architecture. In particular, this implements the required interface in <asm/kfence.h>. KFENCE requires that attributes for pages from its memory pool can individually be set. Therefore, force the entire linear map to be mapped at page granularity. Doing so may result in extra memory allocated for page tables in case rodata=full is not set; however, currently CONFIG_RODATA_FULL_DEFAULT_ENABLED=y is the default, and the common case is therefore not affected by this change. [elver@google.com: add missing copyright and description header] Link: https://lkml.kernel.org/r/20210118092159.145934-3-elver@google.com Link: https://lkml.kernel.org/r/20201103175841.3495947-4-elver@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Co-developed-by: Alexander Potapenko <glider@google.com> Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joern Engel <joern@purestorage.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Paul E. 
McKenney <paulmck@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: SeongJae Park <sjpark@amazon.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc updates from Andrew Morton: "A few small subsystems and some of MM. 172 patches. Subsystems affected by this patch series: hexagon, scripts, ntfs, ocfs2, vfs, and mm (slab-generic, slab, slub, debug, pagecache, swap, memcg, pagemap, mprotect, mremap, page-reporting, vmalloc, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, mempolicy, oom-kill, hugetlbfs, and migration)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (172 commits) mm/migrate: remove unneeded semicolons hugetlbfs: remove unneeded return value of hugetlb_vmtruncate() hugetlbfs: fix some comment typos hugetlbfs: correct some obsolete comments about inode i_mutex hugetlbfs: make hugepage size conversion more readable hugetlbfs: remove meaningless variable avoid_reserve hugetlbfs: correct obsolete function name in hugetlbfs_read_iter() hugetlbfs: use helper macro default_hstate in init_hugetlbfs_fs hugetlbfs: remove useless BUG_ON(!inode) in hugetlbfs_setattr() hugetlbfs: remove special hugetlbfs_set_page_dirty() mm/hugetlb: change hugetlb_reserve_pages() to type bool mm, oom: fix a comment in dump_task() mm/mempolicy: use helper range_in_vma() in queue_pages_test_walk() numa balancing: migrate on fault among multiple bound nodes mm, compaction: make fast_isolate_freepages() stay within zone mm/compaction: fix misbehaviors of fast_find_migrateblock() mm/compaction: correct deferral logic for proactive compaction mm/compaction: remove duplicated VM_BUG_ON_PAGE !PageLocked mm/compaction: remove rcu_read_lock during page compaction z3fold: simplify the zhdr initialization code in init_z3fold_page() ...
2021-02-24kasan, arm64: allow using KUnit tests with HW_TAGS modeAndrey Konovalov
On a high level, this patch allows running KUnit KASAN tests with the hardware tag-based KASAN mode. Internally, this change reenables tag checking at the end of each KASAN test that triggers a tag fault and leads to tag checking being disabled. Also simplify is_write calculation in report_tag_fault. With this patch KASAN tests are still failing for the hardware tag-based mode; fixes come in the next few patches. [andreyknvl@google.com: export HW_TAGS symbols for KUnit tests] Link: https://lkml.kernel.org/r/e7eeb252da408b08f0c81b950a55fb852f92000b.1613155970.git.andreyknvl@google.com Link: https://linux-review.googlesource.com/id/Id94dc9eccd33b23cda4950be408c27f879e474c8 Link: https://lkml.kernel.org/r/51b23112cf3fd62b8f8e9df81026fa2b15870501.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Marco Elver <elver@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24Merge tag 'char-misc-5.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char/misc/whatever driver subsystem updates for 5.12-rc1. Over time it seems like this tree is collecting more and more tiny driver subsystems in one place, making it easier for those maintainers, which is why this is getting larger. Included in here are: - coresight driver updates - habanalabs driver updates - virtual acrn driver addition (proper acks from the x86 maintainers) - broadcom misc driver addition - speakup driver updates - soundwire driver updates - fpga driver updates - amba driver updates - mei driver updates - vfio driver updates - greybus driver updates - nvmem driver updates - phy driver updates - mhi driver updates - interconnect driver updates - fsl-mc bus driver updates - random driver fix - some small misc driver updates (rtsx, pvpanic, etc.) All of these have been in linux-next for a while, with the only reported issue being a merge conflict due to the dfl_device_id addition from the fpga subsystem in here" * tag 'char-misc-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (311 commits) spmi: spmi-pmic-arb: Fix hw_irq overflow Documentation: coresight: Add PID tracing description coresight: etm-perf: Support PID tracing for kernel at EL2 coresight: etm-perf: Clarify comment on perf options ACRN: update MAINTAINERS: mailing list is subscribers-only regmap: sdw-mbq: use MODULE_LICENSE("GPL") regmap: sdw: use no_pm routines for SoundWire 1.2 MBQ regmap: sdw: use _no_pm functions in regmap_read/write soundwire: intel: fix possible crash when no device is detected MAINTAINERS: replace my with email with replacements mhi: Fix double dma free uapi: map_to_7segment: Update example in documentation uio: uio_pci_generic: don't fail probe if pdev->irq equals to IRQ_NOTCONNECTED drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue firewire: replace tricky statement by two 
simple ones vme: make remove callback return void firmware: google: make coreboot driver's remove callback return void firmware: xilinx: Use explicit values for all enum values sample/acrn: Introduce a sample of HSM ioctl interface usage virt: acrn: Introduce an interface for Service VM to control vCPU ...
2021-02-23Merge tag 'idmapped-mounts-v5.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull idmapped mounts from Christian Brauner: "This introduces idmapped mounts which has been in the making for some time. Simply put, different mounts can expose the same file or directory with different ownership. This initial implementation comes with ports for fat, ext4 and with Christoph's port for xfs with more filesystems being actively worked on by independent people and maintainers. Idmapping mounts handle a wide range of long standing use-cases. Here are just a few: - Idmapped mounts make it possible to easily share files between multiple users or multiple machines especially in complex scenarios. For example, idmapped mounts will be used in the implementation of portable home directories in systemd-homed.service(8) where they allow users to move their home directory to an external storage device and use it on multiple computers where they are assigned different uids and gids. This effectively makes it possible to assign random uids and gids at login time. - It is possible to share files from the host with unprivileged containers without having to change ownership permanently through chown(2). - It is possible to idmap a container's rootfs without having to mangle every file. For example, Chromebooks use it to share the user's Download folder with their unprivileged containers in their Linux subsystem. - It is possible to share files between containers with non-overlapping idmappings. - Filesystems that lack a proper concept of ownership such as fat can use idmapped mounts to implement discretionary access (DAC) permission checking. - They allow users to efficiently change ownership on a per-mount basis without having to (recursively) chown(2) all files. In contrast to chown(2) changing ownership of large sets of files is instantaneous with idmapped mounts. This is especially useful when ownership of a whole root filesystem of a virtual machine or container is changed. 
With idmapped mounts a single mount_setattr syscall will be sufficient to change the ownership of all files. - Idmapped mounts always take the current ownership into account as idmappings specify what a given uid or gid is supposed to be mapped to. This contrasts with the chown(2) syscall which cannot by itself take the current ownership of the files it changes into account. It simply changes the ownership to the specified uid and gid. This is especially problematic when recursively chown(2)ing a large set of files which is common with the aforementioned portable home directory and container and vm scenario. - Idmapped mounts allow changing ownership locally, restricting it to specific mounts, and temporarily as the ownership changes only apply as long as the mount exists. Several userspace projects have either already put up patches and pull-requests for this feature or will do so should you decide to pull this: - systemd: In a wide variety of scenarios but especially right away in their implementation of portable home directories. https://systemd.io/HOME_DIRECTORY/ - container runtimes: containerd, runC, LXD: To share data between host and unprivileged containers, unprivileged and privileged containers, etc. The pull request for idmapped mounts support in containerd, the default Kubernetes runtime is already up for quite a while now: https://github.com/containerd/containerd/pull/4734 - The virtio-fs developers and several users have expressed interest in using this feature with virtual machines once virtio-fs is ported. - ChromeOS: Sharing host-directories with unprivileged containers. I've tightly synced with all those projects and all of those listed here have also expressed their need/desire for this feature on the mailing list. For more info on how people use this there's a bunch of talks about this too. 
Here's just two recent ones: https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf https://fosdem.org/2021/schedule/event/containers_idmap/ This comes with an extensive xfstests suite covering both ext4 and xfs: https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts It covers truncation, creation, opening, xattrs, vfscaps, setid execution, setgid inheritance and more both with idmapped and non-idmapped mounts. It already helped to discover an unrelated xfs setgid inheritance bug which has since been fixed in mainline. It will be sent for inclusion with the xfstests project should you decide to merge this. In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before and this is verified in the testsuite. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to set up idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. The user namespace the mount will be marked with can be specified by passing a file descriptor referring to the user namespace as an argument to the new mount_setattr() syscall together with the new MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern of extensibility. 
The following conditions must be met in order to create an idmapped mount: - The caller must currently have the CAP_SYS_ADMIN capability in the user namespace the underlying filesystem has been mounted in. - The underlying filesystem must support idmapped mounts. - The mount must not already be idmapped. This also implies that the idmapping of a mount cannot be altered once it has been idmapped. - The mount must be a detached/anonymous mount, i.e. it must have been created by calling open_tree() with the OPEN_TREE_CLONE flag and it must not already have been visible in the filesystem. The last two points guarantee easier semantics for userspace and the kernel and make the implementation significantly simpler. By default vfsmounts are marked with the initial user namespace and no behavioral or performance changes are observed. The manpage with a detailed description can be found here: https://git.kernel.org/brauner/man-pages/c/1d7b902e2875a1ff342e036a9f866a995640aea8 In order to support idmapped mounts, filesystems need to be changed and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The patches to convert individual filesystem are not very large or complicated overall as can be seen from the included fat, ext4, and xfs ports. Patches for other filesystems are actively worked on and will be sent out separately. The xfstestsuite can be used to verify that port has been done correctly. The mount_setattr() syscall is motivated independent of the idmapped mounts patches and it's been around since July 2019. One of the most valuable features of the new mount api is the ability to perform mounts based on file descriptors only. Together with the lookup restrictions available in the openat2() RESOLVE_* flag namespace which we added in v5.6 this is the first time we are close to hardened and race-free (e.g. symlinks) mounting and path resolution. 
While userspace has started porting to the new mount api to mount proper filesystems and create new bind-mounts it is currently not possible to change mount options of an already existing bind mount in the new mount api since the mount_setattr() syscall is missing. With the addition of the mount_setattr() syscall we remove this last restriction and userspace can now fully port to the new mount api, covering every use-case the old mount api could. We also add the crucial ability to recursively change mount options for a whole mount tree, both removing and adding mount options at the same time. This syscall has been requested multiple times by various people and projects. There is a simple tool available at https://github.com/brauner/mount-idmapped that allows creating idmapped mounts so people can play with this patch series. I'll add support for the regular mount binary should you decide to pull this in the following weeks: Here's an example of a simple idmapped mount of another user's home directory:
u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt
u1001@f2-vm:/$ ls -al /home/ubuntu/
total 28
drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
drwxr-xr-x 4 root root 4096 Oct 28 04:00 ..
-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
-rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout
-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc
-rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile
-rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful
-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo
u1001@f2-vm:/$ ls -al /mnt/
total 28
drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 .
drwxr-xr-x 29 root root 4096 Oct 28 22:01 ..
-rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history
-rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout
-rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc
-rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile
-rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful
-rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo
u1001@f2-vm:/$ touch /mnt/my-file
u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file
u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file
u1001@f2-vm:/$ ls -al /mnt/my-file
-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file
u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file
u1001@f2-vm:/$ getfacl /mnt/my-file
getfacl: Removing leading '/' from absolute path names
# file: mnt/my-file
# owner: u1001
# group: u1001
user::rw-
user:u1001:rwx
group::rw-
mask::rwx
other::r--
u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
getfacl: Removing leading '/' from absolute path names
# file: home/ubuntu/my-file
# owner: ubuntu
# group: ubuntu
user::rw-
user:ubuntu:rwx
group::rw-
mask::rwx
other::r--"
* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits) xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl xfs: support idmapped mounts ext4: support idmapped mounts fat: handle idmapped mounts tests: add mount_setattr() selftests fs: introduce MOUNT_ATTR_IDMAP fs: add mount_setattr() fs: add attr_flags_to_mnt_flags helper fs: split out functions to hold writers namespace: only take read lock in do_reconfigure_mnt() mount: make {lock,unlock}_mount_hash() static namespace: take lock_mount_hash() directly when changing flags nfs: do not export idmapped mounts overlayfs: do not mount on top of idmapped mounts ecryptfs: do not mount on top of idmapped mounts ima: handle idmapped mounts apparmor: handle idmapped mounts fs: make helpers idmap mount aware exec: handle idmapped mounts would_dump: handle idmapped mounts ...
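The id translation visible in the example session (files owned by id 1000 on disk show up as id 1001 through /mnt, and a file created as 1001 through /mnt lands on disk as 1000) can be modelled with a single mapping extent. This is only a minimal sketch: `struct id_extent`, `ID_UNMAPPED` and both helper names are hypothetical and are not the kernel's user namespace helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel for ids without a mapping (hypothetical name for this sketch). */
#define ID_UNMAPPED UINT32_MAX

/* One extent, as in "both:1000:1001:1": from=1000, to=1001, count=1. */
struct id_extent {
	uint32_t from;   /* first id as stored by the filesystem */
	uint32_t to;     /* first id as seen through the idmapped mount */
	uint32_t count;  /* number of consecutive ids covered */
};

/* Translate an on-disk id into the id visible through the mount. */
static uint32_t map_id_to_mount(const struct id_extent *e, uint32_t id)
{
	assert(e != NULL);
	if (id < e->from || id - e->from >= e->count)
		return ID_UNMAPPED;
	return e->to + (id - e->from);
}

/* Translate an id used through the mount back into the on-disk id. */
static uint32_t map_id_from_mount(const struct id_extent *e, uint32_t id)
{
	assert(e != NULL);
	if (id < e->to || id - e->to >= e->count)
		return ID_UNMAPPED;
	return e->from + (id - e->to);
}
```

With the extent `{1000, 1001, 1}`, `map_id_to_mount()` turns 1000 into 1001 and `map_id_from_mount()` turns 1001 back into 1000; every other id has no mapping in this model.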
2021-02-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "x86: - Support for userspace to emulate Xen hypercalls - Raise the maximum number of user memslots - Scalability improvements for the new MMU. Instead of the complex "fast page fault" logic that is used in mmu.c, tdp_mmu.c uses an rwlock so that page faults are concurrent, but the code that can run against page faults is limited. Right now only page faults take the lock for reading; in the future this will be extended to some cases of page table destruction. I hope to switch the default MMU around 5.12-rc3 (some testing was delayed due to Chinese New Year). - Cleanups for MAXPHYADDR checks - Use static calls for vendor-specific callbacks - On AMD, use VMLOAD/VMSAVE to save and restore host state - Stop using deprecated jump label APIs - Workaround for AMD erratum that made nested virtualization unreliable - Support for LBR emulation in the guest - Support for communicating bus lock vmexits to userspace - Add support for SEV attestation command - Miscellaneous cleanups PPC: - Support for second data watchpoint on POWER10 - Remove some complex workarounds for buggy early versions of POWER9 - Guest entry/exit fixes ARM64: - Make the nVHE EL2 object relocatable - Cleanups for concurrent translation faults hitting the same page - Support for the standard TRNG hypervisor call - A bunch of small PMU/Debug fixes - Simplification of the early init hypercall handling Non-KVM changes (with acks): - Detection of contended rwlocks (implemented only for qrwlocks, because KVM only needs it for x86) - Allow __DISABLE_EXPORTS from assembly code - Provide a saner follow_pfn replacements for modules" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (192 commits) KVM: x86/xen: Explicitly pad struct compat_vcpu_info to 64 bytes KVM: selftests: Don't bother mapping GVA for Xen shinfo test KVM: selftests: Fix hex vs. 
decimal snafu in Xen test KVM: selftests: Fix size of memslots created by Xen tests KVM: selftests: Ignore recently added Xen tests' build output KVM: selftests: Add missing header file needed by xAPIC IPI tests KVM: selftests: Add operand to vmsave/vmload/vmrun in svm.c KVM: SVM: Make symbol 'svm_gp_erratum_intercept' static locking/arch: Move qrwlock.h include after qspinlock.h KVM: PPC: Book3S HV: Fix host radix SLB optimisation with hash guests KVM: PPC: Book3S HV: Ensure radix guest has no SLB entries KVM: PPC: Don't always report hash MMU capability for P9 < DD2.2 KVM: PPC: Book3S HV: Save and restore FSCR in the P9 path KVM: PPC: remove unneeded semicolon KVM: PPC: Book3S HV: Use POWER9 SLBIA IH=6 variant to clear SLB KVM: PPC: Book3S HV: No need to clear radix host SLB before loading HPT guest KVM: PPC: Book3S HV: Fix radix guest SLB side channel KVM: PPC: Book3S HV: Remove support for running HPT guest on RPT host without mixed mode support KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWR KVM: PPC: Book3S HV: Add infrastructure to support 2nd DAWR ...
2021-02-21Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds
Pull arm64 updates from Will Deacon: - vDSO build improvements including support for building with BSD. - Cleanup to the AMU support code and initialisation rework to support cpufreq drivers built as modules. - Removal of synthetic frame record from exception stack when entering the kernel from EL0. - Add support for the TRNG firmware call introduced by Arm spec DEN0098. - Cleanup and refactoring across the board. - Avoid calling arch_get_random_seed_long() from add_interrupt_randomness() - Perf and PMU updates including support for Cortex-A78 and the v8.3 SPE extensions. - Significant steps along the road to leaving the MMU enabled during kexec relocation. - Faultaround changes to initialise prefaulted PTEs as 'old' when hardware access-flag updates are supported, which drastically improves vmscan performance. - CPU errata updates for Cortex-A76 (#1463225) and Cortex-A55 (#1024718) - Preparatory work for yielding the vector unit at a finer granularity in the crypto code, which in turn will one day allow us to defer softirq processing when it is in use. - Support for overriding CPU ID register fields on the command-line.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (85 commits) drivers/perf: Replace spin_lock_irqsave to spin_lock mm: filemap: Fix microblaze build failure with 'mmu_defconfig' arm64: Make CPU_BIG_ENDIAN depend on ld.bfd or ld.lld 13.0.0+ arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line arm64: Defer enabling pointer authentication on boot core arm64: cpufeatures: Allow disabling of BTI from the command-line arm64: Move "nokaslr" over to the early cpufeature infrastructure KVM: arm64: Document HVC_VHE_RESTART stub hypercall arm64: Make kvm-arm.mode={nvhe, protected} an alias of id_aa64mmfr1.vh=0 arm64: Add an aliasing facility for the idreg override arm64: Honor VHE being disabled from the command-line arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command line arm64: cpufeature: Add an early command-line cpufeature override facility arm64: Extract early FDT mapping from kaslr_early_init() arm64: cpufeature: Use IDreg override in __read_sysreg_by_encoding() arm64: cpufeature: Add global feature override facility arm64: Move SCTLR_EL1 initialisation to EL-agnostic code arm64: Simplify init_el2_state to be non-VHE only arm64: Move VHE-specific SPE setup to mutate_to_vhe() arm64: Drop early setting of MDSCR_EL2.TPMS ...
2021-02-19arm64 module: set plt* section addresses to 0x0Shaoying Xu
These plt* and .text.ftrace_trampoline sections specified for arm64 have non-zero addresses. Non-zero section addresses in a relocatable ELF would confuse GDB when it tries to compute the section offsets and it ends up printing wrong symbol addresses. Therefore, set them to zero, which mirrors the change in commit 5d8591bc0fba ("module: set ksymtab/kcrctab* section addresses to 0x0"). Reported-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Shaoying Xu <shaoyi@amazon.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20210216183234.GA23876@amazon.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-12Merge tag 'kvmarm-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEADPaolo Bonzini
KVM/arm64 updates for Linux 5.12 - Make the nVHE EL2 object relocatable, resulting in much more maintainable code - Handle concurrent translation faults hitting the same page in a more elegant way - Support for the standard TRNG hypervisor call - A bunch of small PMU/Debug fixes - Allow the disabling of symbol export from assembly code - Simplification of the early init hypercall handling
2021-02-12Merge branch 'for-next/rng' into for-next/coreWill Deacon
Add support for the TRNG firmware call introduced by Arm spec DEN0098. * for-next/rng: arm64: Add support for SMCCC TRNG entropy source firmware: smccc: Introduce SMCCC TRNG framework firmware: smccc: Add SMCCC TRNG function call IDs
2021-02-12Merge branch 'for-next/perf' into for-next/coreWill Deacon
Perf and PMU updates including support for Cortex-A78 and the v8.3 SPE extensions. * for-next/perf: drivers/perf: Replace spin_lock_irqsave to spin_lock dt-bindings: arm: add Cortex-A78 binding arm64: perf: add support for Cortex-A78 arm64: perf: Constify static attribute_group structs drivers/perf: Prevent forced unbinding of ARM_DMC620_PMU drivers perf/arm-cmn: Move IRQs when migrating context perf/arm-cmn: Fix PMU instance naming perf: Constify static struct attribute_group perf: hisi: Constify static struct attribute_group perf/imx_ddr: Constify static struct attribute_group perf: qcom: Constify static struct attribute_group drivers/perf: Add support for ARMv8.3-SPE
2021-02-12Merge branch 'for-next/misc' into for-next/coreWill Deacon
Miscellaneous arm64 changes for 5.12. * for-next/misc: arm64: Make CPU_BIG_ENDIAN depend on ld.bfd or ld.lld 13.0.0+ arm64: vmlinux.ld.S: add assertion for tramp_pg_dir offset arm64: vmlinux.ld.S: add assertion for reserved_pg_dir offset arm64/ptdump:display the Linear Mapping start marker arm64: ptrace: Fix missing return in hw breakpoint code KVM: arm64: Move __hyp_set_vectors out of .hyp.text arm64: Include linux/io.h in mm/mmap.c arm64: cacheflush: Remove stale comment arm64: mm: Remove unused header file arm64/sparsemem: reduce SECTION_SIZE_BITS arm64/mm: Add warning for outside range requests in vmemmap_populate() arm64: Drop workaround for broken 'S' constraint with GCC 4.9
2021-02-12Merge branch 'for-next/kexec' into for-next/coreWill Deacon
Significant steps along the road to leaving the MMU enabled during kexec relocation. * for-next/kexec: arm64: hibernate: add __force attribute to gfp_t casting arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp arm64: kexec: arm64_relocate_new_kernel clean-ups and optimizations arm64: kexec: call kexec_image_info only once arm64: kexec: move relocation function setup arm64: trans_pgd: hibernate: idmap the single page that holds the copy page routines arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz() arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions arm64: trans_pgd: pass allocator trans_pgd_create_copy arm64: trans_pgd: make trans_pgd_map_page generic arm64: hibernate: move page handling function to new trans_pgd.c arm64: hibernate: variable pudp is used instead of pd4dp arm64: kexec: make dtb_mem always enabled
2021-02-12Merge branch 'for-next/faultaround' into for-next/coreWill Deacon
Initialise prefaulted PTEs as 'old' for arm64 when hardware access-flag updates are supported, which drastically improves vmscan performance. * for-next/faultaround: mm: filemap: Fix microblaze build failure with 'mmu_defconfig' mm/nommu: Fix return type of filemap_map_pages() mm: Mark anonymous struct field of 'struct vm_fault' as 'const' mm: Use static initialisers for immutable fields of 'struct vm_fault' mm: Avoid modifying vmf.address in __collapse_huge_page_swapin() mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT mm: Move immutable fields of 'struct vm_fault' into anonymous struct arm64: mm: Implement arch_wants_old_prefaulted_pte() mm: Allow architectures to request 'old' entries when prefaulting mm: Cleanup faultaround and finish_fault() codepaths
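The decision described above can be modelled as a small pure function. This is a sketch only: the `PTE_YOUNG` bit position and the function name are hypothetical stand-ins, and the assumption is that an entry is left 'old' only when it is being prefaulted and the hardware can set the access flag on first use without taking a fault.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "accessed" bit for this model; not taken from the kernel headers. */
#define PTE_YOUNG (UINT64_C(1) << 10)

/*
 * Choose the access state of a newly installed entry.
 * prefault:       the entry is mapped speculatively by faultaround
 * hw_access_flag: hardware updates the access flag by itself
 */
static uint64_t set_access_state(uint64_t pte, bool prefault, bool hw_access_flag)
{
	if (prefault && hw_access_flag)
		return pte & ~PTE_YOUNG;  /* stays 'old' until really touched */
	return pte | PTE_YOUNG;           /* mark 'young' to avoid a later fault */
}
```

Pages that were prefaulted but never touched then look unreferenced to reclaim, which is presumably where the vmscan improvement comes from.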
2021-02-12Merge branch 'for-next/crypto' into for-next/coreWill Deacon
Introduce a new macro to allow yielding the vector unit if preemption is required. The initial users of this are being merged via the crypto tree for 5.12. * for-next/crypto: arm64: assembler: add cond_yield macro
2021-02-12Merge branch 'kvm-arm64/pmu-debug-fixes-5.11' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-12Merge branch 'kvm-arm64/rng-5.12' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-12Merge branch 'kvm-arm64/hyp-reloc' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-02-11locking/arch: Move qrwlock.h include after qspinlock.hWaiman Long
include/asm-generic/qrwlock.h was trying to get arch_spin_is_locked via asm-generic/qspinlock.h. However, this does not work because architectures might be using queued rwlocks but not queued spinlocks (csky), or because they might be defining their own queued_* macros before including asm/qspinlock.h. To fix this, ensure that asm/spinlock.h always includes qrwlock.h after defining arch_spin_is_locked (either directly for csky, or via asm/qspinlock.h for other architectures). The only inclusion elsewhere is in kernel/locking/qrwlock.c. That one is really unnecessary because the file is only compiled in SMP configurations (config QUEUED_RWLOCKS depends on SMP) and in that case linux/spinlock.h already includes asm/qrwlock.h if needed, via asm/spinlock.h. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Waiman Long <longman@redhat.com> Fixes: 26128cb6c7e6 ("locking/rwlocks: Add contention detection for rwlocks") Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Ben Gardon <bgardon@google.com> [Add arch/sparc and kernel/locking parts per discussion with Waiman. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-09arm64: cpufeatures: Allow disabling of Pointer Auth from the command-lineMarc Zyngier
In order to be able to disable Pointer Authentication at runtime, whether it is for testing purposes, or to work around HW issues, let's add support for overriding the ID_AA64ISAR1_EL1.{GPI,GPA,API,APA} fields. This is further mapped on the arm64.nopauth command-line alias. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Tested-by: Srinivas Ramana <sramana@codeaurora.org> Link: https://lore.kernel.org/r/20210208095732.3267263-23-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: Defer enabling pointer authentication on boot coreSrinivas Ramana
Defer enabling pointer authentication on the boot core until after it is required to be enabled by the cpufeature framework. This will help in controlling the feature dynamically with a boot parameter. Signed-off-by: Ajay Patil <pajay@qti.qualcomm.com> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1610152163-16554-2-git-send-email-sramana@codeaurora.org Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Link: https://lore.kernel.org/r/20210208095732.3267263-22-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: cpufeatures: Allow disabling of BTI from the command-lineMarc Zyngier
In order to be able to disable BTI at runtime, whether it is for testing purposes, or to work around HW issues, let's add support for overriding the ID_AA64PFR1_EL1.BTI field. This is further mapped on the arm64.nobti command-line alias. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Tested-by: Srinivas Ramana <sramana@codeaurora.org> Link: https://lore.kernel.org/r/20210208095732.3267263-21-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: Allow ID_AA64MMFR1_EL1.VH to be overridden from the command lineMarc Zyngier
As we want to be able to disable VHE at runtime, let's match "id_aa64mmfr1.vh=" from the command line as an override. This doesn't have much effect yet as our boot code doesn't look at the cpufeature, but only at the HW registers. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: David Brazdil <dbrazdil@google.com> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210208095732.3267263-15-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
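Matching such an option can be sketched in userspace C as follows. The function name is hypothetical, the 0..15 range assumes the usual 4-bit width of an ID register field, and the kernel's own early parser is not reproduced here.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one command-line token. On a match of "id_aa64mmfr1.vh=<n>"
 * with n in 0..15, store n in *val and return true.
 */
static bool parse_vh_override(const char *arg, uint8_t *val)
{
	static const char key[] = "id_aa64mmfr1.vh=";
	const size_t klen = sizeof(key) - 1;
	unsigned long v;
	char *end;

	if (strncmp(arg, key, klen) != 0)
		return false;           /* some other option */
	if (arg[klen] == '\0')
		return false;           /* key without a value */

	errno = 0;
	v = strtoul(arg + klen, &end, 0);
	if (errno != 0 || *end != '\0' || v > 0xf)
		return false;           /* not a number, or too wide for the field */

	*val = (uint8_t)v;
	return true;
}
```

Passing "id_aa64mmfr1.vh=0" yields the value 0, while unrelated tokens such as "nokaslr" are rejected.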
2021-02-09arm64: Extract early FDT mapping from kaslr_early_init()Marc Zyngier
As we want to parse more options very early in the kernel lifetime, let's always map the FDT early. This is achieved by moving that code out of kaslr_early_init(). No functional change expected. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Link: https://lore.kernel.org/r/20210208095732.3267263-13-maz@kernel.org [will: Ensure KASAN is enabled before running C code] Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: cpufeature: Use IDreg override in __read_sysreg_by_encoding()Marc Zyngier
__read_sysreg_by_encoding() is used by a bunch of cpufeature helpers, which should take the feature override into account. Let's do that. For good measure (and because we are likely to need it further down the line), make this helper available to the rest of the non-modular kernel. Code that needs to know the *real* features of a CPU can still use read_sysreg_s(), and find the bare, ugly truth. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210208095732.3267263-12-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
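Taking an override into account when reading a register can be sketched as a mask-and-merge. The struct and function names below are hypothetical and only illustrate the idea: bits covered by the mask come from the override, all other bits come from the hardware value.

```c
#include <stdint.h>

/* Hypothetical override record: 'mask' selects the overridden bits. */
struct ftr_override {
	uint64_t val;
	uint64_t mask;
};

/* Merge the override into the raw hardware value of an ID register. */
static uint64_t apply_override(uint64_t hw, const struct ftr_override *o)
{
	return (hw & ~o->mask) | (o->val & o->mask);
}
```

An empty override (mask of zero) leaves the hardware value untouched, so callers that have nothing to override pay only for the two bitwise operations.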
2021-02-09arm64: cpufeature: Add global feature override facilityMarc Zyngier
Add a facility to globally override a feature, no matter what the HW says. Yes, this sounds dangerous, but we do respect the "safe" value for a given feature. This doesn't mean the user doesn't need to know what they are doing. Nothing uses this yet, so we are pretty safe. For now. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210208095732.3267263-11-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
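One way to read "respecting the safe value" is sketched below for a field where lower values are the safe direction: an override may hide or downgrade a feature, but it cannot claim more than the hardware reports. This is a simplified assumption for illustration; the function name is hypothetical and other per-field policies are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Resolve a requested override against the hardware value of a field
 * whose safe direction is "lower". Returns the effective field value
 * and reports through *applied whether the override was honoured.
 */
static uint8_t resolve_lower_safe(uint8_t hw, uint8_t requested, bool *applied)
{
	if (requested <= hw) {
		*applied = true;
		return requested;   /* downgrade is safe */
	}
	*applied = false;
	return hw;                  /* unsafe request: keep the hardware value */
}
```

For example, a hardware value of 1 with a requested value of 0 resolves to 0, whereas a requested value of 2 is ignored and the field stays at 1.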
2021-02-09arm64: Simplify init_el2_state to be non-VHE onlyMarc Zyngier
As init_el2_state is now nVHE only, let's simplify it and drop the VHE setup. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: David Brazdil <dbrazdil@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210208095732.3267263-9-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: Drop early setting of MDSCR_EL2.TPMSMarc Zyngier
When running VHE, we set MDCR_EL2.TPMS very early on to force the trapping of EL1 SPE accesses to EL2. However: - we are running with HCR_EL2.{E2H,TGE}={1,1}, meaning that there is no EL1 to trap from - before entering a guest, we call kvm_arm_setup_debug(), which sets MDCR_EL2_TPMS in the per-vcpu shadow mdcr_el2, which gets applied on entry by __activate_traps_common(). The early setting of MDCR_EL2.TPMS is therefore useless and can be dropped. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210208095732.3267263-7-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09KVM: Raise the maximum number of user memslotsVitaly Kuznetsov
Current KVM_USER_MEM_SLOTS limits are arch specific (512 on Power, 509 on x86, 32 on s390, 16 on MIPS) but they don't really need to be. Memory slots are allocated dynamically in KVM when added so the only real limitation is 'id_to_index' array which is 'short'. We don't have any other KVM_MEM_SLOTS_NUM/KVM_USER_MEM_SLOTS-sized statically defined structures. Low KVM_USER_MEM_SLOTS can be a limiting factor for some configurations. In particular, when QEMU tries to start a Windows guest with Hyper-V SynIC enabled and e.g. 256 vCPUs the limit is hit as SynIC requires two pages per vCPU and the guest is free to pick any GFN for each of them, this fragments memslots as QEMU wants to have a separate memslot for each of these pages (which are supposed to act as 'overlay' pages). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210127175731.2020089-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
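The arithmetic behind the Hyper-V SynIC case is easy to check. The sketch below assumes the worst case described above, where each of the two per-vCPU pages ends up in its own memslot; the names are hypothetical.

```c
#include <stdbool.h>

/* SynIC needs two overlay pages per vCPU (from the description above). */
enum { SYNIC_PAGES_PER_VCPU = 2 };

/* Worst case: one separate memslot per SynIC page. */
static unsigned int synic_memslots(unsigned int vcpus)
{
	return vcpus * SYNIC_PAGES_PER_VCPU;
}

/* True if the SynIC pages alone would not fit under the given limit. */
static bool exceeds_limit(unsigned int vcpus, unsigned int limit)
{
	return synic_memslots(vcpus) > limit;
}
```

With 256 vCPUs this gives 512 memslots, which is already above the x86 limit of 509 before any regular guest memory is counted.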
2021-02-08arm64: Provide an 'upgrade to VHE' stub hypercallMarc Zyngier
As we are about to change the way a VHE system boots, let's provide the core helper, in the form of a stub hypercall that enables VHE and replicates the full EL1 context at EL2, thanks to EL1 and VHE-EL2 being extremely similar. On exception return, the kernel carries on at EL2. Fancy! Nothing calls this new hypercall yet, so no functional change. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: David Brazdil <dbrazdil@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210208095732.3267263-5-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-08arm64: Turn the MMU-on sequence into a macroMarc Zyngier
Turning the MMU on is a popular sport in the arm64 kernel, and we do it more than once, or even twice. As we are about to add even more, let's turn it into a macro. No expected functional change. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Link: https://lore.kernel.org/r/20210208095732.3267263-4-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-08arm64: Fix labels in el2_setup macrosMarc Zyngier
If someone happens to write the following code:
	b	1f
	init_el2_state	vhe
1:
	[...]
they will be in for a long debugging session, as the label "1f" will be resolved *inside* the init_el2_state macro instead of after it. Not really what one expects. Instead, rewrite the EL2 setup macros to use unambiguous labels, thanks to the usual macro counter trick. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: David Brazdil <dbrazdil@google.com> Link: https://lore.kernel.org/r/20210208095732.3267263-2-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-04arm64: Add TRFCR_ELx definitionsJonathan Zhou
Add definitions for the Arm v8.4 SelfHosted trace extensions registers. [ split the register definitions to separate patch rename some of the symbols ] Link: https://lore.kernel.org/r/20210110224850.1880240-28-suzuki.poulose@arm.com Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Jonathan Zhou <jonathan.zhouwen@huawei.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Link: https://lore.kernel.org/r/20210201181351.1475223-30-mathieu.poirier@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-03arm64: assembler: add cond_yield macroArd Biesheuvel
Add a macro cond_yield that branches to a specified label if the TIF_NEED_RESCHED flag is set and decreasing the preempt count would make the task preemptible again, which would cause a reschedule. This can be used by kernel mode SIMD code that keeps a lot of state in SIMD registers, which would make chunking the input in order to perform the cond_resched() check from C code disproportionately costly. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210203113626.220151-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
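The condition tested by such a macro can be modelled in C. The encoding used here is an assumption made for this sketch and is not stated in the commit message: the preempt count sits in the low 32 bits of a 64-bit word, and the high 32 bits hold a flag that is zero when a reschedule is needed. Under that assumption both conditions collapse into a single comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the modelled 64-bit preempt word (assumed layout):
 *   bits [31:0]  preempt count
 *   bits [63:32] 0 when a reschedule is needed, 1 otherwise
 */
static uint64_t make_preempt_word(uint32_t count, bool need_resched)
{
	uint64_t flag = need_resched ? 0 : 1;

	return (flag << 32) | count;
}

/*
 * Yield only if dropping one preempt-disable level would leave the
 * count at zero AND a reschedule is pending, i.e. the word is exactly 1.
 */
static bool should_yield(uint64_t preempt_word)
{
	return preempt_word - 1 == 0;
}
```

A count of 1 with a pending reschedule yields; a deeper nesting level, or no pending reschedule, does not.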
2021-02-03arm64: vmlinux.ld.S: add assertion for tramp_pg_dir offsetJoey Gouly
Add TRAMP_SWAPPER_OFFSET and use that instead of hardcoding the offset between swapper_pg_dir and tramp_pg_dir. Then use TRAMP_SWAPPER_OFFSET to assert that the offset is correct at link time. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210202123658.22308-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-03arm64: vmlinux.ld.S: add assertion for reserved_pg_dir offsetJoey Gouly
Add RESERVED_SWAPPER_OFFSET and use that instead of hardcoding the offset between swapper_pg_dir and reserved_pg_dir. Then use RESERVED_SWAPPER_OFFSET to assert that the offset is correct at link time. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210202123658.22308-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>