summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2022-12-04Merge tag 'powerpc-6.1-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix oops in 32-bit BPF tail call tests - Add missing declaration for machine_check_early_boot() Thanks to Christophe Leroy and Naveen N. Rao. * tag 'powerpc-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: Add missing declaration for machine_check_early_boot() powerpc/bpf/32: Fix Oops on tail call tests
2022-12-02Merge tag 'riscv-for-linus-6.1-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - build fix for the NR_CPUS Kconfig SBI version dependency - fixes to early memory initialization, to fix page permissions in EFI and post-initmem-free - build fix for the VDSO, to avoid trying to profile the VDSO functions - fixes for kexec crash handling, to fix multi-core and interrupt related initialization inside the crash kernel - fix for a race condition when handling multiple concurrect kernel stack overflows * tag 'riscv-for-linus-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path riscv: mm: Proper page permissions after initmem free riscv: vdso: fix section overlapping under some conditions riscv: fix race when vmap stack overflow riscv: Sync efi page table's kernel mappings before switching riscv: Fix NR_CPUS range conditions
2022-12-02x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3Pawan Gupta
The "force" argument to write_spec_ctrl_current() is currently ambiguous as it does not guarantee the MSR write. This is due to the optimization that writes to the MSR happen only when the new value differs from the cached value. This is fine in most cases, but breaks for S3 resume when the cached MSR value gets out of sync with the hardware MSR value due to S3 resetting it. When x86_spec_ctrl_current is same as x86_spec_ctrl_base, the MSR write is skipped. Which results in SPEC_CTRL mitigations not getting restored. Move the MSR write from write_spec_ctrl_current() to a new function that unconditionally writes to the MSR. Update the callers accordingly and rename functions. [ bp: Rework a bit. ] Fixes: caa0ff24d5d0 ("x86/bugs: Keep a per-CPU IA32_SPEC_CTRL value") Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/806d39b0bfec2fe8f50dc5446dff20f5bb24a959.1669821572.git.pawan.kumar.gupta@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-02Merge tag 'mm-hotfixes-stable-2022-12-02' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "15 hotfixes, 11 marked cc:stable. Only three or four of the latter address post-6.0 issues, which is hopefully a sign that things are converging" * tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible" Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths mm/khugepaged: fix GUP-fast interaction by sending IPI mm/khugepaged: take the right locks for page table retraction mm: migrate: fix THP's mapcount on isolation mm: introduce arch_has_hw_nonleaf_pmd_young() mm: add dummy pmd_young() for architectures not having it mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes() tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep" nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry() hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing madvise: use zap_page_range_single for madvise dontneed mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE
2022-12-01RISC-V: Fix a race condition during kernel stack overflowPalmer Dabbelt
This fixes a concrete bug but is also the basis for some cleanup work, so I'm merging it based on the offending commit in order to minimize future conflicts. * commit '7e1864332fbc1b993659eab7974da9fe8bf8c128': riscv: fix race when vmap stack overflow
2022-12-01Merge tag 'efi-fixes-for-v6.1-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "A single revert for some code that I added during this cycle. The code is not wrong, but it should be a bit more careful about how to handle the shadow call stack pointer, so it is better to revert it for now and bring it back later in improved form. Summary: - Revert runtime service sync exception recovery on arm64" * tag 'efi-fixes-for-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Revert "Recover from synchronous exceptions ..."
2022-12-01arm64: efi: Revert "Recover from synchronous exceptions ..."Ard Biesheuvel
This reverts commit 23715a26c8d81291, which introduced some code in assembler that manipulates both the ordinary and the shadow call stack pointer in a way that could potentially be taken advantage of. So let's revert it, and do a better job the next time around. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-11-30Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A set of clk driver fixes that resolve issues for various SoCs. Most of these are incorrect clk data, like bad parent descriptions. When the clk tree is improperly described things don't work, like USB and UFS controllers, because clk frequencies are wonky. Here are the extra details: - Fix the parent of UFS reference clks on Qualcomm SC8280XP so that UFS works properly - Fix the clk ID for USB on AT91 RM9200 so the USB driver continues to probe - Stop using of_device_get_match_data() on the wrong device for a Samsung Exynos driver so it gets the proper clk data - Fix ExynosAutov9 binding - Fix the parent of the div4 clk on Exynos7885 - Stop calling runtime PM APIs from the Qualcomm GDSC driver directly as it leads to a lockdep splat and is just plain wrong because it violates runtime PM semantics by calling runtime PM APIs when the device has been runtime PM disabled" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: qcom: gcc-sc8280xp: add cxo as parent for three ufs ref clks ARM: at91: rm9200: fix usb device clock id clk: samsung: Revert "clk: samsung: exynos-clkout: Use of_device_get_match_data()" dt-bindings: clock: exynosautov9: fix reference to CMU_FSYS1 clk: qcom: gdsc: Remove direct runtime PM calls clk: samsung: exynos7885: Correct "div4" clock parents
2022-11-30mm: introduce arch_has_hw_nonleaf_pmd_young()Juergen Gross
When running as a Xen PV guests commit eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in pmdp_test_and_clear_young(): BUG: unable to handle page fault for address: ffff8880083374d0 #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065 Oops: 0003 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1 RIP: e030:pmdp_test_and_clear_young+0x25/0x40 This happens because the Xen hypervisor can't emulate direct writes to page table entries other than PTEs. This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young() similar to arch_has_hw_pte_young() and test that instead of CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG. Link: https://lkml.kernel.org/r/20221123064510.16225-1-jgross@suse.com Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") Signed-off-by: Juergen Gross <jgross@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Yu Zhao <yuzhao@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: David Hildenbrand <david@redhat.com> [core changes] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30mm: add dummy pmd_young() for architectures not having itJuergen Gross
In order to avoid #ifdeffery add a dummy pmd_young() implementation as a fallback. This is required for the later patch "mm: introduce arch_has_hw_nonleaf_pmd_young()". Link: https://lkml.kernel.org/r/fd3ac3cd-7349-6bbd-890a-71a9454ca0b3@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-29Merge patch series "riscv: kexec: Fxiup crash_save percpu and ↵Palmer Dabbelt
machine_kexec_mask_interrupts" guoren@kernel.org <guoren@kernel.org> says: From: Guo Ren <guoren@linux.alibaba.com> Current riscv kexec can't crash_save percpu states and disable interrupts properly. The patch series fix them, make kexec work correct. * b4-shazam-merge: riscv: kexec: Fixup crash_smp_send_stop without multi cores riscv: kexec: Fixup irq controller broken in kexec crash path Link: https://lore.kernel.org/r/20221020141603.2856206-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: kexec: Fixup crash_smp_send_stop without multi coresGuo Ren
Current crash_smp_send_stop is the same as the generic one in kernel/panic and misses crash_save_cpu in percpu. This patch is inspired by 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()") and adds the same mechanism for riscv. Before this patch, test result: crash> help -r CPU 0: [OFFLINE] CPU 1: epc : ffffffff80009ff0 ra : ffffffff800b789a sp : ff2000001098bb40 gp : ffffffff815fca60 tp : ff60000004680000 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff2000001098bc90 s1 : ffffffff81600798 a0 : ff2000001098bb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000004690800 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff2000001098bb48 s3 : ffffffff81093ec8 s4 : ffffffff816004ac s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f720 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff2000001098b9a8 CPU 2: [OFFLINE] CPU 3: [OFFLINE] After this patch, test result: crash> help -r CPU 0: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ffffffff81403eb0 gp : ffffffff815fcb48 tp : ffffffff81413400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffff81403ec0 s1 : 0000000000000000 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eac s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 1: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff2000000068bf30 gp : ffffffff815fcb48 tp : ff6000000240d400 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff2000000068bf40 s1 : 0000000000000001 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039ea8 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 2: epc : ffffffff80003f34 ra : ffffffff808caa7c sp : ff20000000693f30 gp : ffffffff815fcb48 tp : ff6000000240e900 t0 : 0000000000000000 t1 : 0000000000000000 t2 : 0000000000000000 s0 : ff20000000693f40 s1 : 0000000000000002 a0 : 0000000000000000 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ffffffff816001c8 s3 : ffffffff81600370 s4 : ffffffff80c32e18 s5 : ffffffff819d3018 s6 : ffffffff810e2110 s7 : 0000000000000000 s8 : 0000000000000000 s9 : 0000000080039eb0 s10: 0000000000000000 s11: 0000000000000000 t3 : 0000000000000000 t4 : 0000000000000000 t5 : 0000000000000000 t6 : 0000000000000000 CPU 3: epc : ffffffff8000a1e4 ra : ffffffff800b7bba sp : ff200000109bbb40 gp : ffffffff815fcb48 tp : ff6000000373aa00 t0 : 6666666666663c5b t1 : 0000000000000000 t2 : 666666666666663c s0 : ff200000109bbc90 s1 : ffffffff816007a0 a0 : ff200000109bbb48 a1 : 0000000000000000 a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000 a5 : ff60000002c61c00 a6 : 0000000000000000 a7 : 0000000000000000 s2 : ff200000109bbb48 s3 : ffffffff810941a8 s4 : ffffffff816004b4 s5 : 0000000000000000 s6 : 0000000000000007 s7 : ffffffff80e7f7a0 s8 : 00fffffffffff3f0 s9 : 0000000000000007 s10: 00aaaaaaaab98700 s11: 0000000000000001 t3 : ffffffff819a8097 t4 : ffffffff819a8097 t5 : ffffffff819a8098 t6 : ff200000109bb9a8 Fixes: ad943893d5f1 ("RISC-V: Fixup schedule out issue in machine_crash_shutdown()") Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Cc: Nick Kossifidis <mick@ics.forth.gr> Link: https://lore.kernel.org/r/20221020141603.2856206-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: kexec: Fixup irq controller broken in kexec crash pathGuo Ren
If a crash happens on cpu3 and all interrupts are binding on cpu0, the bad irq routing will cause a crash kernel which can't receive any irq. Because crash kernel won't clean up all harts' PLIC enable bits in enable registers. This patch is similar to 9141a003a491 ("ARM: 7316/1: kexec: EOI active and mask all interrupts in kexec crash path") and 78fd584cdec0 ("arm64: kdump: implement machine_crash_shutdown()"), and PowerPC also has the same mechanism. Fixes: fba8a8674f68 ("RISC-V: Add kexec support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Xianting Tian <xianting.tian@linux.alibaba.com> Cc: Nick Kossifidis <mick@ics.forth.gr> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20221020141603.2856206-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: mm: Proper page permissions after initmem freeBjörn Töpel
64-bit RISC-V kernels have the kernel image mapped separately to alias the linear map. The linear map and the kernel image map are documented as "direct mapping" and "kernel" respectively in [1]. At image load time, the linear map corresponding to the kernel image is set to PAGE_READ permission, and the kernel image map is set to PAGE_READ|PAGE_EXEC. When the initmem is freed, the pages in the linear map should be restored to PAGE_READ|PAGE_WRITE, whereas the corresponding pages in the kernel image map should be restored to PAGE_READ, by removing the PAGE_EXEC permission. This is not the case. For 64-bit kernels, only the linear map is restored to its proper page permissions at initmem free, and not the kernel image map. In practise this results in that the kernel can potentially jump to dead __init code, and start executing invalid instructions, without getting an exception. Restore the freed initmem properly, by setting both the kernel image map to the correct permissions. [1] Documentation/riscv/vm-layout.rst Fixes: e5c35fa04019 ("riscv: Map the kernel with correct permissions the first time") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alex@ghiti.fr> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Link: https://lore.kernel.org/r/20221115090641.258476-1-bjorn@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: vdso: fix section overlapping under some conditionsJisheng Zhang
lkp reported a build error, I tried the config and can reproduce build error as below: VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg ld.lld: error: section .note file range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] ld.lld: error: section .text file range overlaps with .dynamic >>> .text range is [0x800, 0x1993] >>> .dynamic range is [0x808, 0x937] ld.lld: error: section .note virtual address range overlaps with .text >>> .note range is [0x7C8, 0x803] >>> .text range is [0x800, 0x1993] Fix it by setting DISABLE_BRANCH_PROFILING which will disable branch tracing for vdso, thus avoid useless _ftrace_annotated_branch section and _ftrace_branch section. Although we can also fix it by removing the hardcoded .text begin address, but I think that's another story and should be put into another patch. Link: https://lore.kernel.org/lkml/202210122123.Cc4FPShJ-lkp@intel.com/#r Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221102170254.1925-1-jszhang@kernel.org Fixes: ad5d1122b82f ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-29riscv: fix race when vmap stack overflowJisheng Zhang
Currently, when detecting vmap stack overflow, riscv firstly switches to the so called shadow stack, then use this shadow stack to call the get_overflow_stack() to get the overflow stack. However, there's a race here if two or more harts use the same shadow stack at the same time. To solve this race, we introduce spin_shadow_stack atomic var, which will be swap between its own address and 0 in atomic way, when the var is set, it means the shadow_stack is being used; when the var is cleared, it means the shadow_stack isn't being used. Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Suggested-by: Guo Ren <guoren@kernel.org> Reviewed-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20221030124517.2370-1-jszhang@kernel.org [Palmer: Add AQ to the swap, and also some comments.] Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-28riscv: Sync efi page table's kernel mappings before switchingAlexandre Ghiti
The EFI page table is initially created as a copy of the kernel page table. With VMAP_STACK enabled, kernel stacks are allocated in the vmalloc area: if the stack is allocated in a new PGD (one that was not present at the moment of the efi page table creation or not synced in a previous vmalloc fault), the kernel will take a trap when switching to the efi page table when the vmalloc kernel stack is accessed, resulting in a kernel panic. Fix that by updating the efi kernel mappings before switching to the efi page table. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: b91540d52a08 ("RISC-V: Add EFI runtime services") Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20221121133303.1782246-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-28riscv: Fix NR_CPUS range conditionsSamuel Holland
The conditions reference the symbol SBI_V01, which does not exist. The correct symbol is RISCV_SBI_V01. Fixes: e623715f3d67 ("RISC-V: Increase range and default value of NR_CPUS") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221126061557.3541-1-samuel@sholland.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-11-27Merge tag 'x86_urgent_for_v6.1_rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - ioremap: mask out the bits which are not part of the physical address *after* the size computation is done to prevent any hypothetical ioremap failures - Change the MSR save/restore functionality during suspend to rely on flags denoting that the related MSRs are actually supported vs reading them and assuming they are (an Atom one allows reading but not writing, thus breaking this scheme at resume time) - prevent IV reuse in the AES-GCM communication scheme between SNP guests and the AMD secure processor * tag 'x86_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioremap: Fix page aligned size calculation in __ioremap_caller() x86/pm: Add enumeration check before spec MSRs save/restore setup x86/tsx: Add a feature bit for TSX control MSR support virt/sev-guest: Prevent IV reuse in the SNP guest driver
2022-11-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86: - Fixes for Xen emulation. While nobody should be enabling it in the kernel (the only public users of the feature are the selftests), the bug effectively allows userspace to read arbitrary memory. - Correctness fixes for nested hypervisors that do not intercept INIT or SHUTDOWN on AMD; the subsequent CPU reset can cause a use-after-free when it disables virtualization extensions. While downgrading the panic to a WARN is quite easy, the full fix is a bit more laborious; there are also tests. This is the bulk of the pull request. - Fix race condition due to incorrect mmu_lock use around make_mmu_pages_available(). Generic: - Obey changes to the kvm.halt_poll_ns module parameter in VMs not using KVM_CAP_HALT_POLL, restoring behavior from before the introduction of the capability" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Update gfn_to_pfn_cache khva when it moves within the same page KVM: x86/xen: Only do in-kernel acceleration of hypercalls for guest CPL0 KVM: x86/xen: Validate port number in SCHEDOP_poll KVM: x86/mmu: Fix race condition in direct_page_fault KVM: x86: remove exit_int_info warning in svm_handle_exit KVM: selftests: add svm part to triple_fault_test KVM: x86: allow L1 to not intercept triple fault kvm: selftests: add svm nested shutdown test KVM: selftests: move idt_entry to header KVM: x86: forcibly leave nested mode on vCPU reset KVM: x86: add kvm_leave_nested KVM: x86: nSVM: harden svm_free_nested against freeing vmcb02 while still in use KVM: x86: nSVM: leave nested mode on vCPU free KVM: Obey kvm.halt_poll_ns in VMs not using KVM_CAP_HALT_POLL KVM: Avoid re-reading kvm->max_halt_poll_ns during halt-polling KVM: Cap vcpu->halt_poll_ns before halting rather than after
2022-11-26Merge tag 'kbuild-fixes-v6.1-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix CC_HAS_ASM_GOTO_TIED_OUTPUT test in Kconfig - Fix noisy "No such file or directory" message when KBUILD_BUILD_VERSION is passed - Include rust/ in source tarballs - Fix missing FORCE for ARCH=nios2 builds * tag 'kbuild-fixes-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: nios2: add FORCE for vmlinuz.gz scripts: add rust in scripts/Makefile.package kbuild: fix "cat: .version: No such file or directory" init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dash
2022-11-27nios2: add FORCE for vmlinuz.gzRandy Dunlap
Add FORCE to placate a warning from make: arch/nios2/boot/Makefile:24: FORCE prerequisite is missing Fixes: 2fc8483fdcde ("nios2: Build infrastructure") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
2022-11-25Merge tag 's390-6.1-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Fix size of incorrectly increased from four to eight bytes TOD field of crash dump save area. As result in case of kdump NT_S390_TODPREG ELF notes section contains correct value and "detected read beyond size of field" compiler warning goes away. - Fix memory leak in cryptographic Adjunct Processors (AP) module on initialization failure path. - Add Gerald Schaefer <gerald.schaefer@linux.ibm.com> and Alexander Gordeev <agordeev@linux.ibm.com> as S390 memory management maintainers. Also rename the S390 section to S390 ARCHITECTURE to be a bit more precise. * tag 's390-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: MAINTAINERS: add S390 MM section s390/crashdump: fix TOD programmable field size s390/ap: fix memory leak in ap_init_qci_info()
2022-11-25Merge tag 'hyperv-fixes-signed-20221125' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Fix IRTE allocation in Hyper-V PCI controller (Dexuan Cui) - Fix handling of SCSI srb_status and capacity change events (Michael Kelley) - Restore VP assist page after CPU offlining and onlining (Vitaly Kuznetsov) - Fix some memory leak issues in VMBus (Yang Yingliang) * tag 'hyperv-fixes-signed-20221125' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: vmbus: fix possible memory leak in vmbus_device_register() Drivers: hv: vmbus: fix double free in the error path of vmbus_add_channel_work() PCI: hv: Only reuse existing IRTE allocation for Multi-MSI scsi: storvsc: Fix handling of srb_status and capacity change events x86/hyperv: Restore VP assist page after cpu offlining/onlining
2022-11-26powerpc/64s: Add missing declaration for machine_check_early_boot()Michael Ellerman
There's no declaration for machine_check_early_boot(), which leads to a build failure with W=1. Add one. Fixes: 2f5182cffa43 ("powerpc/64s: early boot machine check handler") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221125132521.2167039-1-mpe@ellerman.id.au
2022-11-24Merge tag 'soc-fixes-6.1-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are a bunch of late fixes that just came in, in particular a longer series for Rockchips devicetree files, but most of those just address cosmetic errors that were found during the binding validation. There are a couple of code changes: - A regression fix to the IXP42x PCI bus - A fix for a memory leak on optee, and another one for mach-mxs - Two fixes for the sunxi rsb bus driver, to address problems with the shutdown logic The rest are small but important devicetree fixes for a number of individual boards, addressing issues across all platforms: - arm global timer on older rockchip SoCs is unstable and needs to be disabled in favor of a more reliable clocksource - Corrections to fix bluetooth, mmc, and networking on a few Rockchip boards - at91/sam9g20ek UDC needs a pin controller config change - an omap board runs into mmc probe errors because of regulator nodes in the wrong place - imx8mp-evk has a minor inaccuracy with its pin config, but without user visible impact - The Allwinner H6 Hantro G2 video decoder needs an IOMMU reference to prevent the driver from crashing" * tag 'soc-fixes-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits) bus: ixp4xx: Don't touch bit 7 on IXP42x ARM: dts: imx6q-prti6q: Fix ref/tcxo-clock-frequency properties arm64: dts: imx8mp-evk: correct pcie pad settings ARM: mxs: fix memory leak in mxs_machine_init() ARM: dts: at91: sam9g20ek: enable udc vbus gpio pinctrl tee: optee: fix possible memory leak in optee_register_device() arm64: dts: allwinner: h6: Add IOMMU reference to Hantro G2 media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property bus: sunxi-rsb: Support atomic transfers bus: sunxi-rsb: Remove the shutdown callback ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188 arm64: dts: rockchip: Fix Pine64 Quartz4-B PMIC interrupt ARM: dts: am335x-pcm-953: Define fixed regulators in root node ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name arm64: dts: rockchip: fix ir-receiver node names ARM: dts: rockchip: fix ir-receiver node names arm64: dts: rockchip: fix adc-keys sub node names ARM: dts: rockchip: fix adc-keys sub node names arm: dts: rockchip: remove clock-frequency from rtc arm: dts: rockchip: fix node name for hym8563 rtc ...
2022-11-24Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fixes from Russell King: "Two fixes for 6.1: - fix stacktraces for tracepoint events in Thumb2 mode - fix for noMMU ZERO_PAGE() implementation" * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9266/1: mm: fix no-MMU ZERO_PAGE() implementation ARM: 9251/1: perf: Fix stacktraces for tracepoint events in THUMB2 kernels
2022-11-24Merge tag 'v6.2-rockchip-dts32-1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Disabling of the unreliable arm-global-timer on earliest Rockchip SoCs, due to its frequency being bound to the changing cpu clock. * tag 'v6.2-rockchip-dts32-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: disable arm_global_timer on rk3066 and rk3188
2022-11-24s390/crashdump: fix TOD programmable field sizeHeiko Carstens
The size of the TOD programmable field was incorrectly increased from four to eight bytes with commit 1a2c5840acf9 ("s390/dump: cleanup CPU save area handling"). This leads to an elf notes section NT_S390_TODPREG which has a size of eight instead of four bytes in case of kdump, however even worse is that the contents is incorrect: it is supposed to contain only the contents of the TOD programmable field, but in fact contains a mix of the TOD programmable field (32 bit upper bits) and parts of the CPU timer register (lower 32 bits). Fix this by simply changing the size of the todpreg field within the save area structure. This will implicitly also fix the size of the corresponding elf notes sections. This also gets rid of this compile time warning: in function ‘fortify_memcpy_chk’, inlined from ‘save_area_add_regs’ at arch/s390/kernel/crash_dump.c:99:2: ./include/linux/fortify-string.h:413:25: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 413 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: 1a2c5840acf9 ("s390/dump: cleanup CPU save area handling") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-24powerpc/bpf/32: Fix Oops on tail call testsChristophe Leroy
test_bpf tail call tests end up as: test_bpf: #0 Tail call leaf jited:1 85 PASS test_bpf: #1 Tail call 2 jited:1 111 PASS test_bpf: #2 Tail call 3 jited:1 145 PASS test_bpf: #3 Tail call 4 jited:1 170 PASS test_bpf: #4 Tail call load/store leaf jited:1 190 PASS test_bpf: #5 Tail call load/store jited:1 BUG: Unable to handle kernel data access on write at 0xf1b4e000 Faulting instruction address: 0xbe86b710 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K MMU=Hash PowerMac Modules linked in: test_bpf(+) CPU: 0 PID: 97 Comm: insmod Not tainted 6.1.0-rc4+ #195 Hardware name: PowerMac3,1 750CL 0x87210 PowerMac NIP: be86b710 LR: be857e88 CTR: be86b704 REGS: f1b4df20 TRAP: 0300 Not tainted (6.1.0-rc4+) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28008242 XER: 00000000 DAR: f1b4e000 DSISR: 42000000 GPR00: 00000001 f1b4dfe0 c11d2280 00000000 00000000 00000000 00000002 00000000 GPR08: f1b4e000 be86b704 f1b4e000 00000000 00000000 100d816a f2440000 fe73baa8 GPR16: f2458000 00000000 c1941ae4 f1fe2248 00000045 c0de0000 f2458030 00000000 GPR24: 000003e8 0000000f f2458000 f1b4dc90 3e584b46 00000000 f24466a0 c1941a00 NIP [be86b710] 0xbe86b710 LR [be857e88] __run_one+0xec/0x264 [test_bpf] Call Trace: [f1b4dfe0] [00000002] 0x2 (unreliable) Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 0000000000000000 ]--- This is a tentative to write above the stack. The problem is encoutered with tests added by commit 38608ee7b690 ("bpf, tests: Add load store test case for tail call") This happens because tail call is done to a BPF prog with a different stack_depth. At the time being, the stack is kept as is when the caller tail calls its callee. But at exit, the callee restores the stack based on its own properties. Therefore here, at each run, r1 is erroneously increased by 32 - 16 = 16 bytes. This was done that way in order to pass the tail call count from caller to callee through the stack. As powerpc32 doesn't have a red zone in the stack, it was necessary the maintain the stack as is for the tail call. But it was not anticipated that the BPF frame size could be different. Let's take a new approach. Use register r4 to carry the tail call count during the tail call, and save it into the stack at function entry if required. This means the input parameter must be in r3, which is more correct as it is a 32 bits parameter, then tail call better match with normal BPF function entry, the down side being that we move that input parameter back and forth between r3 and r4. That can be optimised later. Doing that also has the advantage of maximising the common parts between tail calls and a normal function exit. With the fix, tail call tests are now successfull: test_bpf: #0 Tail call leaf jited:1 53 PASS test_bpf: #1 Tail call 2 jited:1 115 PASS test_bpf: #2 Tail call 3 jited:1 154 PASS test_bpf: #3 Tail call 4 jited:1 165 PASS test_bpf: #4 Tail call load/store leaf jited:1 101 PASS test_bpf: #5 Tail call load/store jited:1 141 PASS test_bpf: #6 Tail call error path, max count reached jited:1 994 PASS test_bpf: #7 Tail call count preserved across function calls jited:1 140975 PASS test_bpf: #8 Tail call error path, NULL target jited:1 110 PASS test_bpf: #9 Tail call error path, index out of range jited:1 69 PASS test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed] Suggested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Fixes: 51c66ad849a7 ("powerpc/bpf: Implement extended BPF on PPC32") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/757acccb7fbfc78efa42dcf3c974b46678198905.1669278887.git.christophe.leroy@csgroup.eu
2022-11-24kbuild: fix "cat: .version: No such file or directory"Masahiro Yamada
Since commit 2df8220cc511 ("kbuild: build init/built-in.a just once"), the .version file is not touched at all when KBUILD_BUILD_VERSION is given. If KBUILD_BUILD_VERSION is specified and the .version file is missing (for example right after 'make mrproper'), "No such file or director" is shown. Even if the .version exists, it is irrelevant to the version of the current build. $ make -j$(nproc) KBUILD_BUILD_VERSION=100 mrproper defconfig all [ snip ] BUILD arch/x86/boot/bzImage cat: .version: No such file or directory Kernel: arch/x86/boot/bzImage is ready (#) Show KBUILD_BUILD_VERSION if it is given. Fixes: 2df8220cc511 ("kbuild: build init/built-in.a just once") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-11-23Merge branch 'kvm-dwmw2-fixes' into HEADPaolo Bonzini
This brings in a few important fixes for Xen emulation. While nobody should be enabling it, the bug effectively allows userspace to read arbitrary memory. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-23KVM: x86/xen: Only do in-kernel acceleration of hypercalls for guest CPL0David Woodhouse
There are almost no hypercalls which are valid from CPL > 0, and definitely none which are handled by the kernel. Fixes: 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests") Reported-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Sean Christopherson <seanjc@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-23KVM: x86/xen: Validate port number in SCHEDOP_pollDavid Woodhouse
We shouldn't allow guests to poll on arbitrary port numbers off the end of the event channel table. Fixes: 1a65105a5aba ("KVM: x86/xen: handle PV spinlocks slowpath") [dwmw2: my bug though; the original version did check the validity as a side-effect of an idr_find() which I ripped out in refactoring.] Reported-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Sean Christopherson <seanjc@google.com> Cc: stable@kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-23KVM: x86/mmu: Fix race condition in direct_page_faultKazuki Takiguchi
make_mmu_pages_available() must be called with mmu_lock held for write. However, if the TDP MMU is used, it will be called with mmu_lock held for read. This function does nothing unless shadow pages are used, so there is no race unless nested TDP is used. Since nested TDP uses shadow pages, old shadow pages may be zapped by this function even when the TDP MMU is enabled. Since shadow pages are never allocated by kvm_tdp_mmu_map(), a race condition can be avoided by not calling make_mmu_pages_available() if the TDP MMU is currently in use. I encountered this when repeatedly starting and stopping nested VM. It can be artificially caused by allocating a large number of nested TDP SPTEs. For example, the following BUG and general protection fault are caused in the host kernel. pte_list_remove: 00000000cd54fc10 many->many ------------[ cut here ]------------ kernel BUG at arch/x86/kvm/mmu/mmu.c:963! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:pte_list_remove.cold+0x16/0x48 [kvm] Call Trace: <TASK> drop_spte+0xe0/0x180 [kvm] mmu_page_zap_pte+0x4f/0x140 [kvm] __kvm_mmu_prepare_zap_page+0x62/0x3e0 [kvm] kvm_mmu_zap_oldest_mmu_pages+0x7d/0xf0 [kvm] direct_page_fault+0x3cb/0x9b0 [kvm] kvm_tdp_page_fault+0x2c/0xa0 [kvm] kvm_mmu_page_fault+0x207/0x930 [kvm] npf_interception+0x47/0xb0 [kvm_amd] svm_invoke_exit_handler+0x13c/0x1a0 [kvm_amd] svm_handle_exit+0xfc/0x2c0 [kvm_amd] kvm_arch_vcpu_ioctl_run+0xa79/0x1780 [kvm] kvm_vcpu_ioctl+0x29b/0x6f0 [kvm] __x64_sys_ioctl+0x95/0xd0 do_syscall_64+0x5c/0x90 general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:kvm_mmu_commit_zap_page.part.0+0x4b/0xe0 [kvm] Call Trace: <TASK> kvm_mmu_zap_oldest_mmu_pages+0xae/0xf0 [kvm] direct_page_fault+0x3cb/0x9b0 [kvm] kvm_tdp_page_fault+0x2c/0xa0 [kvm] kvm_mmu_page_fault+0x207/0x930 [kvm] npf_interception+0x47/0xb0 [kvm_amd] CVE: CVE-2022-45869 Fixes: a2855afc7ee8 ("KVM: x86/mmu: Allow parallel page faults for the TDP MMU") Signed-off-by: Kazuki Takiguchi <takiguchi.kazuki171@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-23Merge tag 'v6.1-rockchip-dtsfixes1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixes to make the automated binding tools happier (node-names, undocumented + unneeded properties) and fixes for non-working devices on some boards. * tag 'v6.1-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: Fix Pine64 Quartz4-B PMIC interrupt ARM: dts: rockchip: rk3188: fix lcdc1-rgb24 node name arm64: dts: rockchip: fix ir-receiver node names ARM: dts: rockchip: fix ir-receiver node names arm64: dts: rockchip: fix adc-keys sub node names ARM: dts: rockchip: fix adc-keys sub node names arm: dts: rockchip: remove clock-frequency from rtc arm: dts: rockchip: fix node name for hym8563 rtc arm64: dts: rockchip: remove clock-frequency from rtc arm64: dts: rockchip: fix node name for hym8563 rtc arm64: dts: rockchip: lower rk3399-puma-haikou SD controller clock frequency arm64: dts: rockchip: keep I2S1 disabled for GPIO function on ROCK Pi 4 series arm64: dts: rockchip: fix quartz64-a bluetooth configuration arm64: dts: rockchip: add enable-strobe-pulldown to emmc phy on nanopi4 arm64: dts: rockchip: remove i2c5 from rk3566-roc-pc arm64: dts: rockchip: Fix i2c3 pinctrl on rk3566-roc-pc arm64: dts: rockchip: Fix gmac failure of rgmii-id from rk3566-roc-pc arm64: dts: rockchip: Drop RK3399-Scarlet's repeated ec_ap_int_l definition Link: https://lore.kernel.org/r/6274427.GXAFRqVoOG@phil Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-22x86/ioremap: Fix page aligned size calculation in __ioremap_caller()Michael Kelley
Current code re-calculates the size after aligning the starting and ending physical addresses on a page boundary. But the re-calculation also embeds the masking of high order bits that exceed the size of the physical address space (via PHYSICAL_PAGE_MASK). If the masking removes any high order bits, the size calculation results in a huge value that is likely to immediately fail. Fix this by re-calculating the page-aligned size first. Then mask any high order bits using PHYSICAL_PAGE_MASK. Fixes: ffa71f33a820 ("x86, ioremap: Fix incorrect physical address handling in PAE mode") Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/1668624097-14884-2-git-send-email-mikelley@microsoft.com
2022-11-21Merge tag 'am335x-pcm-953-regulators' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes Regulator changes for am335x-pcm-953 This is for deferred probe issue on am335x-pcm-953 sdhci-omap regulator. * tag 'am335x-pcm-953-regulators' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: am335x-pcm-953: Define fixed regulators in root node Link: https://lore.kernel.org/r/pull-1669036672-530717@atomide.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-21x86/pm: Add enumeration check before spec MSRs save/restore setupPawan Gupta
pm_save_spec_msr() keeps a list of all the MSRs which _might_ need to be saved and restored at hibernate and resume. However, it has zero awareness of CPU support for these MSRs. It mostly works by unconditionally attempting to manipulate these MSRs and relying on rdmsrl_safe() being able to handle a #GP on CPUs where the support is unavailable. However, it's possible for reads (RDMSR) to be supported for a given MSR while writes (WRMSR) are not. In this case, msr_build_context() sees a successful read (RDMSR) and marks the MSR as valid. Then, later, a write (WRMSR) fails, producing a nasty (but harmless) error message. This causes restore_processor_state() to try and restore it, but writing this MSR is not allowed on the Intel Atom N2600 leading to: unchecked MSR access error: WRMSR to 0x122 (tried to write 0x0000000000000002) \ at rIP: 0xffffffff8b07a574 (native_write_msr+0x4/0x20) Call Trace: <TASK> restore_processor_state x86_acpi_suspend_lowlevel acpi_suspend_enter suspend_devices_and_enter pm_suspend.cold state_store kernfs_fop_write_iter vfs_write ksys_write do_syscall_64 ? do_syscall_64 ? up_read ? lock_is_held_type ? asm_exc_page_fault ? lockdep_hardirqs_on entry_SYSCALL_64_after_hwframe To fix this, add the corresponding X86_FEATURE bit for each MSR. Avoid trying to manipulate the MSR when the feature bit is clear. This required adding a X86_FEATURE bit for MSRs that do not have one already, but it's a small price to pay. [ bp: Move struct msr_enumeration inside the only function that uses it. ] Fixes: 73924ec4d560 ("x86/pm: Save the MSR validity status at context setup") Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/c24db75d69df6e66c0465e13676ad3f2837a2ed8.1668539735.git.pawan.kumar.gupta@linux.intel.com
2022-11-21x86/tsx: Add a feature bit for TSX control MSR supportPawan Gupta
Support for the TSX control MSR is enumerated in MSR_IA32_ARCH_CAPABILITIES. This is different from how other CPU features are enumerated i.e. via CPUID. Currently, a call to tsx_ctrl_is_supported() is required for enumerating the feature. In the absence of a feature bit for TSX control, any code that relies on checking feature bits directly will not work. In preparation for adding a feature bit check in MSR save/restore during suspend/resume, set a new feature bit X86_FEATURE_TSX_CTRL when MSR_IA32_TSX_CTRL is present. Also make tsx_ctrl_is_supported() use the new feature bit to avoid any overhead of reading the MSR. [ bp: Remove tsx_ctrl_is_supported(), add room for two more feature bits in word 11 which are coming up in the next merge window. ] Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/de619764e1d98afbb7a5fa58424f1278ede37b45.1668539735.git.pawan.kumar.gupta@linux.intel.com
2022-11-21Merge tag 'imx-fixes-6.1-3' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 6.1, part 3: - Fix a small memory leak in mach-mxs code. - Correct PCIe pad configuration for imx8mp-evk board. - Fix ref/tcxo clock frequency property for imx6q-prti6q board. * tag 'imx-fixes-6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx6q-prti6q: Fix ref/tcxo-clock-frequency properties arm64: dts: imx8mp-evk: correct pcie pad settings ARM: mxs: fix memory leak in mxs_machine_init() Link: https://lore.kernel.org/r/20221119073812.GQ16229@T480 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-21Merge tag 'sunxi-fixes-for-6.1-1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes - RSB bus communication fixes - missing IOMMU reference property to H6 Hantro G2 * tag 'sunxi-fixes-for-6.1-1' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: h6: Add IOMMU reference to Hantro G2 media: dt-bindings: allwinner: h6-vpu-g2: Add IOMMU reference property bus: sunxi-rsb: Support atomic transfers bus: sunxi-rsb: Remove the shutdown callback Link: https://lore.kernel.org/r/Y3ftpBFk5+fndA4B@jernej-laptop Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-21Merge tag 'at91-fixes-6.1-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes AT91 fixes for 6.1 #2 It contains: - fix UDC on at91sam9g20ek boards by adding vbus pin * tag 'at91-fixes-6.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux: ARM: dts: at91: sam9g20ek: enable udc vbus gpio pinctrl Link: https://lore.kernel.org/r/20221118131205.301662-1-claudiu.beznea@microchip.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-21LoongArch: Fix unsigned comparison with less than zeroKaiLong Wang
Eliminate the following coccicheck warning: ./arch/loongarch/kernel/unwind_prologue.c:84:5-13: WARNING: Unsigned expression compared with zero: frame_ra < 0 Signed-off-by: KaiLong Wang <wangkailong@jari.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite()Huacai Chen
Set _PAGE_DIRTY only if _PAGE_MODIFIED is set in {pmd,pte}_mkwrite(). Otherwise, _PAGE_DIRTY silences the TLB modify exception and make us have no chance to mark a pmd/pte dirty (_PAGE_MODIFIED) for software. Reviewed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Set _PAGE_DIRTY only if _PAGE_WRITE is set in {pmd,pte}_mkdirty()Huacai Chen
Now {pmd,pte}_mkdirty() set _PAGE_DIRTY bit unconditionally, this causes random segmentation fault after commit 0ccf7f168e17bb7e ("mm/thp: carry over dirty bit when thp splits on pmd"). The reason is: when fork(), parent process use pmd_wrprotect() to clear huge page's _PAGE_WRITE and _PAGE_DIRTY (for COW); then pte_mkdirty() set _PAGE_DIRTY as well as _PAGE_MODIFIED while splitting dirty huge pages; once _PAGE_DIRTY is set, there will be no tlb modify exception so the COW machanism fails; and at last memory corruption occurred between parent and child processes. So, we should set _PAGE_DIRTY only when _PAGE_WRITE is set in {pmd,pte}_ mkdirty(). Cc: stable@vger.kernel.org Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Clear FPU/SIMD thread info flags for kernel threadHuacai Chen
If a kernel thread is created by a user thread, it may carry FPU/SIMD thread info flags (TIF_USEDFPU, TIF_USEDSIMD, etc.). Then it will be considered as a fpu owner and kernel try to save its FPU/SIMD context and cause such errors: [ 41.518931] do_fpu invoked from kernel context![#1]: [ 41.523933] CPU: 1 PID: 395 Comm: iou-wrk-394 Not tainted 6.1.0-rc5+ #217 [ 41.530757] Hardware name: Loongson Loongson-3A5000-7A1000-1w-CRB/Loongson-LS3A5000-7A1000-1w-CRB, BIOS vUDK2018-LoongArch-V2.0.pre-beta8 08/18/2022 [ 41.544064] $ 0 : 0000000000000000 90000000011e9468 9000000106c7c000 9000000106c7fcf0 [ 41.552101] $ 4 : 9000000106305d40 9000000106689800 9000000106c7fd08 0000000003995818 [ 41.560138] $ 8 : 0000000000000001 90000000009a72e4 0000000000000020 fffffffffffffffc [ 41.568174] $12 : 0000000000000000 0000000000000000 0000000000000020 00000009aab7e130 [ 41.576211] $16 : 00000000000001ff 0000000000000407 0000000000000001 0000000000000000 [ 41.584247] $20 : 0000000000000000 0000000000000001 9000000106c7fd70 90000001002f0400 [ 41.592284] $24 : 0000000000000000 900000000178f740 90000000011e9834 90000001063057c0 [ 41.600320] $28 : 0000000000000000 0000000000000001 9000000006826b40 9000000106305140 [ 41.608356] era : 9000000000228848 _save_fp+0x0/0xd8 [ 41.613542] ra : 90000000011e9468 __schedule+0x568/0x8d0 [ 41.619160] CSR crmd: 000000b0 [ 41.619163] CSR prmd: 00000000 [ 41.622359] CSR euen: 00000000 [ 41.625558] CSR ecfg: 00071c1c [ 41.628756] CSR estat: 000f0000 [ 41.635239] ExcCode : f (SubCode 0) [ 41.638783] PrId : 0014c010 (Loongson-64bit) [ 41.643191] Modules linked in: acpi_ipmi vfat fat ipmi_si ipmi_devintf cfg80211 ipmi_msghandler rfkill fuse efivarfs [ 41.653734] Process iou-wrk-394 (pid: 395, threadinfo=0000000004ebe913, task=00000000636fa1be) [ 41.662375] Stack : 00000000ffff0875 9000000006800ec0 9000000006800ec0 90000000002d57e0 [ 41.670412] 0000000000000001 0000000000000000 9000000106535880 0000000000000001 [ 41.678450] 9000000105291800 0000000000000000 9000000105291838 900000000178e000 [ 41.686487] 9000000106c7fd90 9000000106305140 0000000000000001 90000000011e9834 [ 41.694523] 00000000ffff0875 90000000011f034c 9000000105291838 9000000105291830 [ 41.702561] 0000000000000000 9000000006801440 00000000ffff0875 90000000002d48c0 [ 41.710597] 9000000128800001 9000000106305140 9000000105291838 9000000105291838 [ 41.718634] 9000000105291830 9000000107811740 9000000105291848 90000000009bf1e0 [ 41.726672] 9000000105291830 9000000107811748 2d6b72772d756f69 0000000000343933 [ 41.734708] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 41.742745] ... [ 41.745252] Call Trace: [ 42.197868] [<9000000000228848>] _save_fp+0x0/0xd8 [ 42.205214] [<90000000011ed468>] __schedule+0x568/0x8d0 [ 42.210485] [<90000000011ed834>] schedule+0x64/0xd4 [ 42.215411] [<90000000011f434c>] schedule_timeout+0x88/0x188 [ 42.221115] [<90000000009c36d0>] io_wqe_worker+0x184/0x350 [ 42.226645] [<9000000000221cf0>] ret_from_kernel_thread+0xc/0x9c This can be easily triggered by ltp testcase syscalls/io_uring02 and it can also be easily fixed by clearing the FPU/SIMD thread info flags for kernel threads in copy_thread(). Cc: stable@vger.kernel.org Reported-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: SMP: Change prefix from loongson3 to loongsonHuacai Chen
SMP operations can be shared by Loongson-2 series and Loongson-3 series, so we change the prefix from loongson3 to loongson for all functions and data structures. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Combine acpi_boot_table_init() and acpi_boot_init()Huacai Chen
Combine acpi_boot_table_init() and acpi_boot_init() since they are very simple, and we don't need to check the return value of acpi_boot_init(). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-11-21LoongArch: Makefile: Use "grep -E" instead of "egrep"Tiezhu Yang
The latest version of grep claims the egrep is now obsolete so the build now contains warnings that look like: egrep: warning: egrep is obsolescent; using grep -E Fix this up by changing the LoongArch Makefile to use "grep -E" instead. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>