| Age | Commit message (Collapse) | Author |
|
Pull bpf fixes from Alexei Starovoitov:
"Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI,
since all JITs had to be touched to move constant blinding out and
pass bpf_verifier_env in.
- Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov)
- Dissociate struct_ops program with map if map_update fails (Amery
Hung)
- Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann)
- Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel
Borkmann)
- Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns
(Eduard Zingerman)
- Copy token from main to subprogs to fix missing kallsyms (Eduard
Zingerman)
- Prevent double close and leak of btf objects in libbpf (Jiri Olsa)
- Fix af_unix null-ptr-deref in sockmap (Michal Luczaj)
- Fix NULL deref in map_kptr_match_type for scalar regs (Mykyta
Yatsenko)
- Avoid unnecessary IPIs. Remove redundant bpf_flush_icache() in
arm64 and riscv JITs (Puranjay Mohan)
- Fix out of bounds access. Validate node_id in arena_alloc_pages()
(Puranjay Mohan)
- Reject BPF-to-BPF calls and callbacks in arm32 JIT (Puranjay Mohan)
- Refactor all JITs to pass bpf_verifier_env to emit ENDBR/BTI for
indirect jump targets on x86-64, arm64 JITs (Xu Kuohai)
- Allow UTF-8 literals in bpf_bprintf_prepare() (Yihan Ding)"
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (32 commits)
bpf, arm32: Reject BPF-to-BPF calls and callbacks in the JIT
bpf: Dissociate struct_ops program with map if map_update fails
bpf: Validate node_id in arena_alloc_pages()
libbpf: Prevent double close and leak of btf objects
selftests/bpf: cover UTF-8 trace_printk output
bpf: allow UTF-8 literals in bpf_bprintf_prepare()
selftests/bpf: Reject scalar store into kptr slot
bpf: Fix NULL deref in map_kptr_match_type for scalar regs
bpf: Fix precedence bug in convert_bpf_ld_abs alignment check
bpf, arm64: Emit BTI for indirect jump target
bpf, x86: Emit ENDBR for indirect jump targets
bpf: Add helper to detect indirect jump targets
bpf: Pass bpf_verifier_env to JIT
bpf: Move constants blinding out of arch-specific JITs
bpf, sockmap: Take state lock for af_unix iter
bpf, sockmap: Fix af_unix null-ptr-deref in proto update
selftests/bpf: Extend bpf_iter_unix to attempt deadlocking
bpf, sockmap: Fix af_unix iter deadlock
bpf, sockmap: Annotate af_unix sock:: Sk_state data-races
selftests/bpf: verify kallsyms entries for token-loaded subprograms
...
|
|
Pull kvm updates from Paolo Bonzini:
"Arm:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis. This
uses the new infrastructure for 'remote' trace buffers that can be
exposed by non-kernel entities such as firmware, and which came
through the tracing tree
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM
- Finally add support for pKVM protected guests, where pages are
unmapped from the host as they are faulted into the guest and can
be shared back from the guest using pKVM hypercalls. Protected
guests are created using a new machine type identifier. As the
elusive guestmem has not yet delivered on its promises, anonymous
memory is also supported
This is only a first step towards full isolation from the host; for
example, the CPU register state and DMA accesses are not yet
isolated. Because this does not really yet bring fully what it
promises, it is hidden behind CONFIG_ARM_PKVM_GUEST +
'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is
created. Caveat emptor
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to the
various helpers and rendering a substantial amount of state
immutable
- Expand the Stage-2 page table dumper to support NV shadow page
tables on a per-VM basis
- Tidy up the pKVM PSCI proxy code to be slightly less hard to
follow
- Fix both SPE and TRBE in non-VHE configurations so that they do not
generate spurious, out of context table walks that ultimately lead
to very bad HW lockups
- A small set of patches fixing the Stage-2 MMU freeing in error
cases
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls
- The usual cleanups and other selftest churn
LoongArch:
- Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel()
- Add DMSINTC irqchip in kernel support
RISC-V:
- Fix steal time shared memory alignment checks
- Fix vector context allocation leak
- Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
- Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
- Fix integer overflow in kvm_pmu_validate_counter_mask()
- Fix shift-out-of-bounds in make_xfence_request()
- Fix lost write protection on huge pages during dirty logging
- Split huge pages during fault handling for dirty logging
- Skip CSR restore if VCPU is reloaded on the same core
- Implement kvm_arch_has_default_irqchip() for KVM selftests
- Factored-out ISA checks into separate sources
- Added hideleg to struct kvm_vcpu_config
- Factored-out VCPU config into separate sources
- Support configuration of per-VM HGATP mode from KVM user space
s390:
- Support for ESA (31-bit) guests inside nested hypervisors
- Remove restriction on memslot alignment, which is not needed
anymore with the new gmap code
- Fix LPSW/E to update the bear (which of course is the breaking
event address register)
x86:
- Shut up various UBSAN warnings on reading module parameter before
they were initialized
- Don't zero-allocate page tables that are used for splitting
hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in
the page table and thus write all bytes
- As an optimization, bail early when trying to unsync 4KiB mappings
if the target gfn can just be mapped with a 2MiB hugepage
x86 generic:
- Copy single-chunk MMIO write values into struct kvm_vcpu (more
precisely struct kvm_mmio_fragment) to fix use-after-free stack
bugs where KVM would dereference stack pointer after an exit to
userspace
- Clean up and comment the emulated MMIO code to try to make it
easier to maintain (not necessarily "easy", but "easier")
- Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of
VMX and SVM enabling) as it is needed for trusted I/O
- Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions
- Immediately fail the build if a required #define is missing in one
of KVM's headers that is included multiple times
- Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
exception, mostly to prevent syzkaller from abusing the uAPI to
trigger WARNs, but also because it can help prevent userspace from
unintentionally crashing the VM
- Exempt SMM from CPUID faulting on Intel, as per the spec
- Misc hardening and cleanup changes
x86 (AMD):
- Fix and optimize IRQ window inhibit handling for AVIC; make it
per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple
vCPUs have to-be-injected IRQs
- Clean up and optimize the OSVW handling, avoiding a bug in which
KVM would overwrite state when enabling virtualization on multiple
CPUs in parallel. This should not be a problem because OSVW should
usually be the same for all CPUs
- Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains
about a "too large" size based purely on user input
- Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION
- Disallow synchronizing a VMSA of an already-launched/encrypted
vCPU, as doing so for an SNP guest will crash the host due to an
RMP violation page fault
- Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped
queries are required to hold kvm->lock, and enforce it by lockdep.
Fix various bugs where sev_guest() was not ensured to be stable for
the whole duration of a function or ioctl
- Convert a pile of kvm->lock SEV code to guard()
- Play nicer with userspace that does not enable
KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6
as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the
payload would end up in EXITINFO2 rather than CR2, for example).
Only set CR2 and DR6 when consumption of the payload is imminent,
but on the other hand force delivery of the payload in all paths
where userspace retrieves CR2 or DR6
- Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT
instead of vmcb02->save.cr2. The value is out of sync after a
save/restore or after a #PF is injected into L2
- Fix a class of nSVM bugs where some fields written by the CPU are
not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so
are not up-to-date when saved by KVM_GET_NESTED_STATE
- Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE
and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly
initialized after save+restore
- Add a variety of missing nSVM consistency checks
- Fix several bugs where KVM failed to correctly update VMCB fields
on nested #VMEXIT
- Fix several bugs where KVM failed to correctly synthesize #UD or
#GP for SVM-related instructions
- Add support for save+restore of virtualized LBRs (on SVM)
- Refactor various helpers and macros to improve clarity and
(hopefully) make the code easier to maintain
- Aggressively sanitize fields when copying from vmcb12, to guard
against unintentionally allowing L1 to utilize yet-to-be-defined
features
- Fix several bugs where KVM botched rAX legality checks when
emulating SVM instructions. There are remaining issues in that KVM
doesn't handle size prefix overrides for 64-bit guests
- Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
instead of somewhat arbitrarily synthesizing #GP (i.e. don't double
down on AMD's architectural but sketchy behavior of generating #GP
for "unsupported" addresses)
- Cache all used vmcb12 fields to further harden against TOCTOU bugs
x86 (Intel):
- Drop obsolete branch hint prefixes from the VMX instruction macros
- Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
register input when appropriate
- Code cleanups
guest_memfd:
- Don't mark guest_memfd folios as accessed, as guest_memfd doesn't
support reclaim, the memory is unevictable, and there is no storage
to write back to
LoongArch selftests:
- Add KVM PMU test cases
s390 selftests:
- Enable more memory selftests
x86 selftests:
- Add support for Hygon CPUs in KVM selftests
- Fix a bug in the MSR test where it would get false failures on
AMD/Hygon CPUs with exactly one of RDPID or RDTSCP
- Add an MADV_COLLAPSE testcase for guest_memfd as a regression test
for a bug where the kernel would attempt to collapse guest_memfd
folios against KVM's will"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits)
KVM: x86: use inlines instead of macros for is_sev_*guest
x86/virt: Treat SVM as unsupported when running as an SEV+ guest
KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails
KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper
KVM: SEV: use mutex guard in snp_handle_guest_req()
KVM: SEV: use mutex guard in sev_mem_enc_unregister_region()
KVM: SEV: use mutex guard in sev_mem_enc_ioctl()
KVM: SEV: use mutex guard in snp_launch_update()
KVM: SEV: Assert that kvm->lock is held when querying SEV+ support
KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe"
KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y
KVM: SEV: WARN on unhandled VM type when initializing VM
KVM: LoongArch: selftests: Add PMU overflow interrupt test
KVM: LoongArch: selftests: Add basic PMU event counting test
KVM: LoongArch: selftests: Add cpucfg read/write helpers
LoongArch: KVM: Add DMSINTC inject msi to vCPU
LoongArch: KVM: Add DMSINTC device support
LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function
LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch
LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch
...
|
|
Pull SoC devicetree updates from Arnd Bergmann:
"A number of SoC platforms are adding modernized variants of their
already supported chips time, with a total of 12 new SoCs, and two
older SoC getting removed:
- Qualcomm Glymur is a compute SoC using 18 Oryon-2 CPU cores
- Qualcomm Mahua is a variant of Glymur with only 12 CPU cores, but
largely identical.
- Qualcomm Eliza is an embeded platform for mobile phone (SM7750) and
IOT (QC7790S/M) workloads
- Qualcomm IPQ5210 is a wireless networking SoC using Cortex-A53
cores
- Qualcomm apq8084 and ipq806x had only rudimentary support but no
actual products using them, so they are now gone.
- Axis ARTPEC-9 is a follow-up to the ARTPEC-8 embedded SoC, using
the Samsung SoC platform but now with Cortex-A55 cores
- ARM Zena is a virtual platform in FVP using Cortex-A720AE cores,
with additional versions planned to be merged in the future.
- ARM corstone-1000-a320 is a reference platform for IOT, using
low-end Cortex-A320 cores
- Microchip LAN9691 is an updated 64-bit variant of the arm32 lan966x
series of networking SoCs
- Microchip PIC64GX is an embedded RISC-V chip using SIFIVE U54 CPU
cores
- Rockchip RV1103B is the low-end 32-bit single-core vision processor
- Renesas RZ/G3L (r9a08g046) is an industrial embedded chip using
Cortex-A55 cores, similar to the G3E and G3S variants we already
supported.
- NXP S32N79 is an automotive SoC using Cortex-A78AE cores, a
significant upgrade from the older S32V and S32G series
These all come with at least one reference board or an initial product
using these, in total there are 67 newly added boards. The ones for
already supported SoCs are:
- Two more Aspeed BMC based boards
- Three older tablets based on 32-bit OMAP4 and Exynos5 SoCs
- One Set-top-box based on Allwinner H6
- 22 additional industrial/embedded boards using 64-bit NXP i.MX8M or
i.MX9 SoCs
- 20 Qualcomm SoC based machines across all possible markets:
workstation, gaming, laptop, phone, networking, reference, ...
- Three more Rockchips rk35xx based boards
- Four variants of the Toradex Verdin using TI AM62
Other notable bits are:
- A cleanup for the 32-bit Tegra paz00 board moved the last board
specific code on Tegra into equivalent dts syntax.
- There continues to be a significant number of fixes for static
checking of dtc syntax, but it feels like this is slowing down,
hopefully getting into a state where most known issues are
addressed
- Additional hardware support for many existing boards across SoC
families, notably Qualcomm, Broadcom, i.MX2, i.MX6, Rockchips,
STM32, Mediatek, Tegra, TI and Microchip"
* tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (841 commits)
arm64: dts: ti: k3: Use memory-region-names for r5f
ARM: dts: imx: Add DT overlays for DH i.MX6 DHCOM SoM and boards
ARM: dts: imx6sx: remove fallback compatible string fsl,imx28-lcdif
ARM: dts: imx25: rename node name tcq to touchscreen
ARM: dts: imx: b850v3: Disable unused usdhc4
ARM: dts: imx: b850v3: Define GPIO line names
ARM: dts: imx: b850v3: Use alphabetical sorting
ARM: dts: imx: bx50v3: Configure phy-mode to eliminate a warning
ARM: dts: imx: bx50v3: Configure switch PHY max-speed to 100Mbps
ARM: dts: imx7ulp: Add CPU clock and OPP table support
ARM: dts: imx7-mba7: Deassert BOOT_EN after boot
ARM: dts: tqma7: add boot phase properties
ARM: dts: imx7s: add boot phase properties
ARM: dts: tqma6ul[l]: correct spelling of TQ-Systems
ARM: dts: mba6ulx: add boot phase properties
ARM: dts: imx6ul[l]-tqma6ul[l]: add boot phase properties
ARM: dts: imx6ul/imx6ull: add boot phase properties
ARM: dts: imx6qdl-mba6: add boot phase properties
ARM: dts: imx6qdl-tqma6: add boot phase properties
ARM: dts: imx6qdl: add boot phase properties
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "pid: make sub-init creation retryable" (Oleg Nesterov)
Make creation of init in a new namespace more robust by clearing away
some historical cruft which is no longer needed. Also some
documentation fixups
- "selftests/fchmodat2: Error handling and general" (Mark Brown)
Fix and a cleanup for the fchmodat2() syscall selftest
- "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)
- "hung_task: Provide runtime reset interface for hung task detector"
(Aaron Tomlin)
Give administrators the ability to zero out
/proc/sys/kernel/hung_task_detect_count
- "tools/getdelays: use the static UAPI headers from
tools/include/uapi" (Thomas Weißschuh)
Teach getdelays to use the in-kernel UAPI headers rather than the
system-provided ones
- "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)
Several cleanups and fixups to the hardlockup detector code and its
documentation
- "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)
A couple of small/theoretical fixes in the bch code
- "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)
- "cleanup the RAID5 XOR library" (Christoph Hellwig)
A quite far-reaching cleanup to this code. I can't do better than to
quote Christoph:
"The XOR library used for the RAID5 parity is a bit of a mess right
now. The main file sits in crypto/ despite not being cryptography
and not using the crypto API, with the generic implementations
sitting in include/asm-generic and the arch implementations
sitting in an asm/ header in theory. The latter doesn't work for
many cases, so architectures often build the code directly into
the core kernel, or create another module for the architecture
code.
Change this to a single module in lib/ that also contains the
architecture optimizations, similar to the library work Eric
Biggers has done for the CRC and crypto libraries later. After
that it changes to better calling conventions that allow for
smarter architecture implementations (although none is contained
here yet), and uses static_call to avoid indirection function call
overhead"
- "lib/list_sort: Clean up list_sort() scheduling workarounds"
(Kuan-Wei Chiu)
Clean up this library code by removing a hacky thing which was added
for UBIFS, which UBIFS doesn't actually need
- "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)
Fix a few bugs in the scatterlist code, add in-kernel tests for the
now-fixed bugs and fix a leak in the test itself
- "kdump: Enable LUKS-encrypted dump target support in ARM64 and
PowerPC" (Coiby Xu)
Enable support of the LUKS-encrypted device dump target on arm64 and
powerpc
- "ocfs2: consolidate extent list validation into block read callbacks"
(Joseph Qi)
Cleanup, simplify, and make more robust ocfs2's validation of extent
list fields (Kernel test robot loves mounting corrupted fs images!)
* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
ocfs2: validate group add input before caching
ocfs2: validate bg_bits during freefrag scan
ocfs2: fix listxattr handling when the buffer is full
doc: watchdog: fix typos etc
update Sean's email address
ocfs2: use get_random_u32() where appropriate
ocfs2: split transactions in dio completion to avoid credit exhaustion
ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
ocfs2: validate extent block list fields during block read
ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
ocfs2: validate dx_root extent list fields during block read
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
ocfs2: handle invalid dinode in ocfs2_group_extend
.get_maintainer.ignore: add Askar
ocfs2: validate bg_list extent bounds in discontig groups
checkpatch: exclude forward declarations of const structs
tools/accounting: handle truncated taskstats netlink messages
taskstats: set version in TGID exit notifications
ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
...
|
|
Pass bpf_verifier_env to bpf_int_jit_compile(). The follow-up patch will
use env->insn_aux_data in the JIT stage to detect indirect jump targets.
Since bpf_prog_select_runtime() can be called by cbpf and lib/test_bpf.c
code without verifier, introduce helper __bpf_prog_select_runtime()
to accept the env parameter.
Remove the call to bpf_prog_select_runtime() in bpf_prog_load(), and
switch to call __bpf_prog_select_runtime() in the verifier, with env
variable passed. The original bpf_prog_select_runtime() is preserved for
cbpf and lib/test_bpf.c, where env is NULL.
Now all constants blinding calls are moved into the verifier, except
the cbpf and lib/test_bpf.c cases. The instructions arrays are adjusted
by bpf_patch_insn_data() function for normal cases, so there is no need
to call adjust_insn_arrays() in bpf_jit_blind_constants(). Remove it.
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # v14
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260416064341.151802-3-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
During the JIT stage, constants blinding rewrites instructions but only
rewrites the private instruction copy of the JITed subprog, leaving the
global env->prog->insnsi and env->insn_aux_data untouched. This causes a
mismatch between subprog instructions and the global state, making it
difficult to use the global data in the JIT.
To avoid this mismatch, and given that all arch-specific JITs already
support constants blinding, move it to the generic verifier code, and
switch to rewrite the global env->prog->insnsi with the global states
adjusted, as other rewrites in the verifier do.
This removes the constants blinding calls in each JIT, which are largely
duplicated code across architectures.
Since constants blinding is only required for JIT, and there are two
JIT entry functions, jit_subprogs() for BPF programs with multiple
subprogs and bpf_prog_select_runtime() for programs with no subprogs,
move the constants blinding invocation into these two functions.
In the verifier path, bpf_patch_insn_data() is used to keep global
verifier auxiliary data in sync with patched instructions. A key
question is whether this global auxiliary data should be restored
on the failure path.
Besides instructions, bpf_patch_insn_data() adjusts:
- prog->aux->poke_tab
- env->insn_array_maps
- env->subprog_info
- env->insn_aux_data
For prog->aux->poke_tab, it is only used by JIT or only meaningful after
JIT succeeds, so it does not need to be restored on the failure path.
For env->insn_array_maps, when JIT fails, programs using insn arrays
are rejected by bpf_insn_array_ready() due to missing JIT addresses.
Hence, env->insn_array_maps is only meaningful for JIT and does not need
to be restored.
For subprog_info, if jit_subprogs fails and CONFIG_BPF_JIT_ALWAYS_ON
is not enabled, kernel falls back to interpreter. In this case,
env->subprog_info is used to determine subprogram stack depth. So it
must be restored on failure.
For env->insn_aux_data, it is freed by clear_insn_aux_data() at the
end of bpf_check(). Before freeing, clear_insn_aux_data() loops over
env->insn_aux_data to release jump targets recorded in it. The loop
uses env->prog->len as the array length, but this length no longer
matches the actual size of the adjusted env->insn_aux_data array after
constants blinding.
To address it, a simple approach is to keep insn_aux_data as adjusted
after failure, since it will be freed shortly, and record its actual size
for the loop in clear_insn_aux_data(). But since clear_insn_aux_data()
uses the same index to loop over both env->prog->insnsi and env->insn_aux_data,
this approach results in incorrect index for the insnsi array. So an
alternative approach is adopted: clone the original env->insn_aux_data
before blinding and restore it after failure, similar to env->prog.
For classic BPF programs, constants blinding works as before since it
is still invoked from bpf_prog_select_runtime().
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> # powerpc jit
Reviewed-by: Pu Lehui <pulehui@huawei.com> # riscv jit
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # loongarch jit
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260416064341.151802-2-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "maple_tree: Replace big node with maple copy" (Liam Howlett)
Mainly prepararatory work for ongoing development but it does reduce
stack usage and is an improvement.
- "mm, swap: swap table phase III: remove swap_map" (Kairui Song)
Offers memory savings by removing the static swap_map. It also yields
some CPU savings and implements several cleanups.
- "mm: memfd_luo: preserve file seals" (Pratyush Yadav)
File seal preservation to LUO's memfd code
- "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
Chen)
Additional userspace stats reportng to zswap
- "arch, mm: consolidate empty_zero_page" (Mike Rapoport)
Some cleanups for our handling of ZERO_PAGE() and zero_pfn
- "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
Han)
A robustness improvement and some cleanups in the kmemleak code
- "Improve khugepaged scan logic" (Vernon Yang)
Improve khugepaged scan logic and reduce CPU consumption by
prioritizing scanning tasks that access memory frequently
- "Make KHO Stateless" (Jason Miu)
Simplify Kexec Handover by transitioning KHO from an xarray-based
metadata tracking system with serialization to a radix tree data
structure that can be passed directly to the next kernel
- "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
Ballasi and Steven Rostedt)
Enhance vmscan's tracepointing
- "mm: arch/shstk: Common shadow stack mapping helper and
VM_NOHUGEPAGE" (Catalin Marinas)
Cleanup for the shadow stack code: remove per-arch code in favour of
a generic implementation
- "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)
Fix a WARN() which can be emitted the KHO restores a vmalloc area
- "mm: Remove stray references to pagevec" (Tal Zussman)
Several cleanups, mainly udpating references to "struct pagevec",
which became folio_batch three years ago
- "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
Shutsemau)
Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
pages encode their relationship to the head page
- "mm/damon/core: improve DAMOS quota efficiency for core layer
filters" (SeongJae Park)
Improve two problematic behaviors of DAMOS that makes it less
efficient when core layer filters are used
- "mm/damon: strictly respect min_nr_regions" (SeongJae Park)
Improve DAMON usability by extending the treatment of the
min_nr_regions user-settable parameter
- "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)
The proper fix for a previously hotfixed SMP=n issue. Code
simplifications and cleanups ensued
- "mm: cleanups around unmapping / zapping" (David Hildenbrand)
A bunch of cleanups around unmapping and zapping. Mostly
simplifications, code movements, documentation and renaming of
zapping functions
- "support batched checking of the young flag for MGLRU" (Baolin Wang)
Batched checking of the young flag for MGLRU. It's part cleanups; one
benchmark shows large performance benefits for arm64
- "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)
memcg cleanup and robustness improvements
- "Allow order zero pages in page reporting" (Yuvraj Sakshith)
Enhance free page reporting - it is presently and undesirably order-0
pages when reporting free memory.
- "mm: vma flag tweaks" (Lorenzo Stoakes)
Cleanup work following from the recent conversion of the VMA flags to
a bitmap
- "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
Park)
Add some more developer-facing debug checks into DAMON core
- "mm/damon: test and document power-of-2 min_region_sz requirement"
(SeongJae Park)
An additional DAMON kunit test and makes some adjustments to the
addr_unit parameter handling
- "mm/damon/core: make passed_sample_intervals comparisons
overflow-safe" (SeongJae Park)
Fix a hard-to-hit time overflow issue in DAMON core
- "mm/damon: improve/fixup/update ratio calculation, test and
documentation" (SeongJae Park)
A batch of misc/minor improvements and fixups for DAMON
- "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
Hildenbrand)
Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
movement was required.
- "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)
A somewhat random mix of fixups, recompression cleanups and
improvements in the zram code
- "mm/damon: support multiple goal-based quota tuning algorithms"
(SeongJae Park)
Extend DAMOS quotas goal auto-tuning to support multiple tuning
algorithms that users can select
- "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)
Fix the khugpaged sysfs handling so we no longer spam the logs with
reams of junk when starting/stopping khugepaged
- "mm: improve map count checks" (Lorenzo Stoakes)
Provide some cleanups and slight fixes in the mremap, mmap and vma
code
- "mm/damon: support addr_unit on default monitoring targets for
modules" (SeongJae Park)
Extend the use of DAMON core's addr_unit tunable
- "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)
Cleanups to khugepaged and is a base for Nico's planned khugepaged
mTHP support
- "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)
Code movement and cleanups in the memhotplug and sparsemem code
- "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION" (David Hildenbrand)
Rationalize some memhotplug Kconfig support
- "change young flag check functions to return bool" (Baolin Wang)
Cleanups to change all young flag check functions to return bool
- "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
Law and SeongJae Park)
Fix a few potential DAMON bugs
- "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
Stoakes)
Convert a lot of the existing use of the legacy vm_flags_t data type
to the new vma_flags_t type which replaces it. Mainly in the vma
code.
- "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)
Expand the mmap_prepare functionality, which is intended to replace
the deprecated f_op->mmap hook which has been the source of bugs and
security issues for some time. Cleanups, documentation, extension of
mmap_prepare into filesystem drivers
- "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)
Simplify and clean up zap_huge_pmd(). Additional cleanups around
vm_normal_folio_pmd() and the softleaf functionality are performed.
* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm: fix deferred split queue races during migration
mm/khugepaged: fix issue with tracking lock
mm/huge_memory: add and use has_deposited_pgtable()
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
mm/huge_memory: separate out the folio part of zap_huge_pmd()
mm/huge_memory: use mm instead of tlb->mm
mm/huge_memory: remove unnecessary sanity checks
mm/huge_memory: deduplicate zap deposited table call
mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
mm/huge_memory: add a common exit path to zap_huge_pmd()
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
mm/huge: avoid big else branch in zap_huge_pmd()
mm/huge_memory: simplify vma_is_specal_huge()
mm: on remap assert that input range within the proposed VMA
mm: add mmap_action_map_kernel_pages[_full]()
uio: replace deprecated mmap hook with mmap_prepare in uio_info
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
mm: allow handling of stacked mmap_prepare hooks in more drivers
...
|
|
bpf_flush_icache() calls flush_icache_range() to clean the data cache
and invalidate the instruction cache for the JITed code region. However,
since commit 48a8f78c50bd ("bpf, riscv: use prog pack allocator in the
BPF JIT"), this flush is redundant.
bpf_jit_binary_pack_finalize() copies the JITed instructions to the ROX
region via bpf_arch_text_copy() -> patch_text_nosync(), and
patch_text_nosync() already calls flush_icache_range() on the written
range. The subsequent bpf_flush_icache() repeats the same cache
maintenance on an overlapping range.
Remove the redundant bpf_flush_icache() call and its now-unused
definition.
Fixes: 48a8f78c50bd ("bpf, riscv: use prog pack allocator in the BPF JIT")
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Tested-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260413191111.3426023-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support updates from Rafael Wysocki:
"These include an update of the CMOS RTC driver and the related ACPI
and x86 code that, among other things, switches it over to using the
platform device interface for device binding on x86 instead of the PNP
device driver interface (which allows the code in question to be
simplified quite a bit), a major update of the ACPI Time and Alarm
Device (TAD) driver adding an RTC class device interface to it, and
updates of core ACPI drivers that remove some unnecessary and not
really useful code from them.
Apart from that, two drivers are converted to using the platform
driver interface for device binding instead of the ACPI driver one,
which is slated for removal, support for the Performance Limited
register is added to the ACPI CPPC library and there are some
janitorial updates of it and the related cpufreq CPPC driver, the ACPI
processor driver is fixed and cleaned up, and NVIDIA vendor CPER
record handler is added to the APEI GHES code.
Also, the interface for obtaining a CPU UID from ACPI is consolidated
across architectures and used for fixing a problem with the PCI TPH
Steering Tag on ARM64, there are two updates related to ACPICA, a
minor ACPI OS Services Layer (OSL) update, and a few assorted updates
related to ACPI tables parsing.
Specifics:
- Update maintainers information regarding ACPICA (Rafael Wysocki)
- Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()
(Kees Cook)
- Trigger an ordered system power off after encountering a fatal
error operator in AML (Armin Wolf)
- Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)
- Remove the temporary stop-gap acpi_pptt_cache_v1_full structure
from the ACPI PPTT parser (Ben Horgan)
- Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
DeSimone)
- Address multiple assorted issues and clean up the code in the ACPI
processor idle driver (Huisong Li)
- Replace strlcat() in the ACPI processor idle drive with a better
alternative (Andy Shevchenko)
- Rearrange and clean up acpi_processor_errata_piix4() (Rafael
Wysocki)
- Move reference performance to capabilities and fix an uninitialized
variable in the ACPI CPPC library (Pengjie Zhang)
- Add support for the Performance Limited Register to the ACPI CPPC
library (Sumit Gupta)
- Add cppc_get_perf() API to read performance controls, extend
cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
library warn on missing mandatory DESIRED_PERF register (Sumit
Gupta)
- Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in
target callbacks to allow it to control performance bounds via
standard scaling_min_freq and scaling_max_freq sysfs attributes and
add sysfs documentation for the Performance Limited Register to it
(Sumit Gupta)
- Add ACPI support to the platform device interface in the CMOS RTC
driver, make the ACPI core device enumeration code create a
platform device for the CMOS RTC, and drop CMOS RTC PNP device
support (Rafael Wysocki)
- Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
driver and clean up the CMOS RTC ACPI address space handler (Rafael
Wysocki)
- Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
and allow that driver to work without a dedicated IRQ if the ACPI
alarm is used (Rafael Wysocki)
- Clean up the ACPI TAD driver in various ways and add an RTC class
device interface, including both the RTC setting/reading and alarm
timer support, to it (Rafael Wysocki)
- Clean up the ACPI AC and ACPI PAD (processor aggregator device)
drivers (Rafael Wysocki)
- Rework checking for duplicate video bus devices and consolidate
pnp.bus_id workarounds handling in the ACPI video bus driver
(Rafael Wysocki)
- Update the ACPI core device drivers to stop setting
acpi_device_name() unnecessarily (Rafael Wysocki)
- Rearrange code using acpi_device_class() in the ACPI core device
drivers and update them to stop setting acpi_device_class()
unnecessarily (Rafael Wysocki)
- Define ACPI_AC_CLASS in one place (Rafael Wysocki)
- Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver
to bind to platform devices instead of ACPI devices (Rafael
Wysocki)
- Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
Feng)
- Consolidate the interface for obtaining a CPU UID from ACPI across
architectures and use it to address incorrect PCI TPH Steering Tag
on ARM64 resulting from the invalid assumption that the ACPI
Processor UID would always be the same as the corresponding logical
CPU ID in Linux (Chengwen Feng)"
* tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
ACPICA: Update maintainers information
watchdog: ni903x_wdt: Convert to a platform driver
ACPI: PAD: xen: Convert to a platform driver
ACPI: processor: idle: Reset cpuidle on C-state list changes
cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM
ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()
ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler
PCI: hisi: Use devm_ghes_register_vendor_record_notifier()
ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
ACPI: tables: Enable FPDT on LoongArch
ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
ACPI: processor: idle: Reset power_setup_done flag on initialization failure
ACPI: TAD: Add alarm support to the RTC class device interface
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- randomize_kstack: Improve implementation across arches (Ryan Roberts)
- lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
- refcount: Remove unused __signed_wrap function annotations
* tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
refcount: Remove unused __signed_wrap function annotations
randomize_kstack: Unify random source across arches
randomize_kstack: Maintain kstack_offset per task
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Migrate more hash algorithms from the traditional crypto subsystem to
lib/crypto/
Like the algorithms migrated earlier (e.g. SHA-*), this simplifies
the implementations, improves performance, enables further
simplifications in calling code, and solves various other issues:
- AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC)
- Support these algorithms in lib/crypto/ using the AES library
and the existing arm64 assembly code
- Reimplement the traditional crypto API's "cmac(aes)",
"xcbc(aes)", and "cbcmac(aes)" on top of the library
- Convert mac80211 to use the AES-CMAC library. Note: several
other subsystems can use it too and will be converted later
- Drop the broken, nonstandard, and likely unused support for
"xcbc(aes)" with key lengths other than 128 bits
- Enable optimizations by default
- GHASH
- Migrate the standalone GHASH code into lib/crypto/
- Integrate the GHASH code more closely with the very similar
POLYVAL code, and improve the generic GHASH implementation to
resist cache-timing attacks and use much less memory
- Reimplement the AES-GCM library and the "gcm" crypto_aead
template on top of the GHASH library. Remove "ghash" from the
crypto_shash API, as it's no longer needed
- Enable optimizations by default
- SM3
- Migrate the kernel's existing SM3 code into lib/crypto/, and
reimplement the traditional crypto API's "sm3" on top of it
- I don't recommend using SM3, but this cleanup is worthwhile
to organize the code the same way as other algorithms
- Testing improvements:
- Add a KUnit test suite for each of the new library APIs
- Migrate the existing ChaCha20Poly1305 test to KUnit
- Make the KUnit all_tests.config enable all crypto library tests
- Move the test kconfig options to the Runtime Testing menu
- Other updates to arch-optimized crypto code:
- Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine
- Remove some MD5 implementations that are no longer worth keeping
- Drop big endian and voluntary preemption support from the arm64
code, as those configurations are no longer supported on arm64
- Make jitterentropy and samples/tsm-mr use the crypto library APIs
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (66 commits)
lib/crypto: arm64: Assume a little-endian kernel
arm64: fpsimd: Remove obsolete cond_yield macro
lib/crypto: arm64/sha3: Remove obsolete chunking logic
lib/crypto: arm64/sha512: Remove obsolete chunking logic
lib/crypto: arm64/sha256: Remove obsolete chunking logic
lib/crypto: arm64/sha1: Remove obsolete chunking logic
lib/crypto: arm64/poly1305: Remove obsolete chunking logic
lib/crypto: arm64/gf128hash: Remove obsolete chunking logic
lib/crypto: arm64/chacha: Remove obsolete chunking logic
lib/crypto: arm64/aes: Remove obsolete chunking logic
lib/crypto: Include <crypto/utils.h> instead of <crypto/algapi.h>
lib/crypto: aesgcm: Don't disable IRQs during AES block encryption
lib/crypto: aescfb: Don't disable IRQs during AES block encryption
lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit
lib/crypto: sparc: Drop optimized MD5 code
lib/crypto: mips: Drop optimized MD5 code
lib: Move crypto library tests to Runtime Testing menu
crypto: sm3 - Remove 'struct sm3_state'
crypto: sm3 - Remove the original "sm3_block_generic()"
crypto: sm3 - Remove sm3_base.h
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull Rust updates from Miguel Ojeda:
"Toolchain and infrastructure:
- Bump the minimum Rust version to 1.85.0 (and 'bindgen' to 0.71.1).
As proposed in LPC 2025 and the Maintainers Summit [1], we are
going to follow Debian Stable's Rust versions as our minimum
versions.
Debian Trixie was released on 2025-08-09 with a Rust 1.85.0 and
'bindgen' 0.71.1 toolchain, which is a fair amount of time for e.g.
kernel developers to upgrade.
Other major distributions support a Rust version that is high
enough as well, including:
+ Arch Linux.
+ Fedora Linux.
+ Gentoo Linux.
+ Nix.
+ openSUSE Slowroll and openSUSE Tumbleweed.
+ Ubuntu 25.10 and 26.04 LTS. In addition, 24.04 LTS using
their versioned packages.
The merged patch series comes with the associated cleanups and
simplifications treewide that can be performed thanks to both
bumps, as well as documentation updates.
In addition, start using 'bindgen''s '--with-attribute-custom-enum'
feature to set the 'cfi_encoding' attribute for the 'lru_status'
enum used in Binder.
Link: https://lwn.net/Articles/1050174/ [1]
- Add experimental Kconfig option ('CONFIG_RUST_INLINE_HELPERS') that
inlines C helpers into Rust.
Essentially, it performs a step similar to LTO, but just for the
helpers, i.e. very local and fast.
It relies on 'llvm-link' and its '--internalize' flag, and requires
a compatible LLVM between Clang and 'rustc' (i.e. same major
version, 'CONFIG_RUSTC_CLANG_LLVM_COMPATIBLE'). It is only enabled
for two architectures for now.
The result is a measurable speedup in different workloads that
different users have tested. For instance, for the null block
driver, it amounts to a 2%.
- Support global per-version flags.
While we already have per-version flags in many places, we didn't
have a place to set global ones that depend on the compiler
version, i.e. in 'rust_common_flags', which sometimes is needed to
e.g. tweak the lints set per version.
Use that to allow the 'clippy::precedence' lint for Rust < 1.86.0,
since it had a change in behavior.
- Support overriding the crate name and apply it to Rust Binder,
which wanted the module to be called 'rust_binder'.
- Add the remaining '__rust_helper' annotations (started in the
previous cycle).
'kernel' crate:
- Introduce the 'const_assert!' macro: a more powerful version of
'static_assert!' that can refer to generics inside functions or
implementation bodies, e.g.:
fn f<const N: usize>() {
const_assert!(N > 1);
}
fn g<T>() {
const_assert!(size_of::<T>() > 0, "T cannot be ZST");
}
In addition, reorganize our set of build-time assertion macros
('{build,const,static_assert}!') to live in the 'build_assert'
module.
Finally, improve the docs as well to clarify how these are
different from one another and how to pick the right one to use,
and their equivalence (if any) to the existing C ones for extra
clarity.
- 'sizes' module: add 'SizeConstants' trait.
This gives us typed 'SZ_*' constants (avoiding casts) for use in
device address spaces where the address width depends on the
hardware (e.g. 32-bit MMIO windows, 64-bit GPU framebuffers, etc.),
e.g.:
let gpu_heap = 14 * u64::SZ_1M;
let mmio_window = u32::SZ_16M;
- 'clk' module: implement 'Send' and 'Sync' for 'Clk' and thus
simplify the users in Tyr and PWM.
- 'ptr' module: add 'const_align_up'.
- 'str' module: improve the documentation of the 'c_str!' macro to
explain that one should only use it for non-literal cases (for the
other case we instead use C string literals, e.g. 'c"abc"').
- Disallow the use of 'CStr::{as_ptr,from_ptr}' and clean one such
use in the 'task' module.
- 'sync' module: finish the move of 'ARef' and 'AlwaysRefCounted'
outside of the 'types' module, i.e. update the last remaining
instances and finally remove the re-exports.
- 'error' module: clarify that 'from_err_ptr' can return 'Ok(NULL)',
including runtime-tested examples.
The intention is to hopefully prevent UB that assumes the result of
the function is not 'NULL' if successful. This originated from a
case of UB I noticed in 'regulator' that created a 'NonNull' on it.
Timekeeping:
- Expand the example section in the 'HrTimer' documentation.
- Mark the 'ClockSource' trait as unsafe to ensure valid values for
'ktime_get()'.
- Add 'Delta::from_nanos()'.
'pin-init' crate:
- Replace the 'Zeroable' impls for 'Option<NonZero*>' with impls of
'ZeroableOption' for 'NonZero*'.
- Improve feature gate handling for unstable features.
- Declutter the documentation of implementations of 'Zeroable' for
tuples.
- Replace uses of 'addr_of[_mut]!' with '&raw [mut]'.
rust-analyzer:
- Add type annotations to 'generate_rust_analyzer.py'.
- Add support for scripts written in Rust ('generate_rust_target.rs',
'rustdoc_test_builder.rs', 'rustdoc_test_gen.rs').
- Refactor 'generate_rust_analyzer.py' to explicitly identify host
and target crates, improve readability, and reduce duplication.
And some other fixes, cleanups and improvements"
* tag 'rust-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (79 commits)
rust: sizes: add SizeConstants trait for device address space constants
rust: kernel: update `file_with_nul` comment
rust: kbuild: allow `clippy::precedence` for Rust < 1.86.0
rust: kbuild: support global per-version flags
rust: declare cfi_encoding for lru_status
docs: rust: general-information: use real example
docs: rust: general-information: simplify Kconfig example
docs: rust: quick-start: remove GDB/Binutils mention
docs: rust: quick-start: remove Nix "unstable channel" note
docs: rust: quick-start: remove Gentoo "testing" note
docs: rust: quick-start: add Ubuntu 26.04 LTS and remove subsection title
docs: rust: quick-start: update minimum Ubuntu version
docs: rust: quick-start: update Ubuntu versioned packages
docs: rust: quick-start: openSUSE provides `rust-src` package nowadays
rust: kbuild: remove "dummy parameter" workaround for `bindgen` < 0.71.1
rust: kbuild: update `bindgen --rust-target` version and replace comment
rust: rust_is_available: remove warning for `bindgen` < 0.69.5 && libclang >= 19.1
rust: rust_is_available: remove warning for `bindgen` 0.66.[01]
rust: bump `bindgen` minimum supported version to 0.71.1 (Debian Trixie)
rust: block: update `const_refs_to_static` MSRV TODO comment
...
|
|
KVM/riscv changes for 7.1
- Fix steal time shared memory alignment checks
- Fix vector context allocation leak
- Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
- Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
- Fix integer overflow in kvm_pmu_validate_counter_mask()
- Fix shift-out-of-bounds in make_xfence_request()
- Fix lost write protection on huge pages during dirty logging
- Split huge pages during fault handling for dirty logging
- Skip CSR restore if VCPU is reloaded on the same core
- Implement kvm_arch_has_default_irqchip() for KVM selftests
- Factored-out ISA checks into separate sources
- Added hideleg to struct kvm_vcpu_config
- Factored-out VCPU config into separate sources
- Support configuration of per-VM HGATP mode from KVM user space
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt
RISC-V devicetrees for v7.1
Generic:
Add binding coverage for Supm.
Microchip:
Add support for the picgx64 and its curiosity board. This is a PolarFire
SoC without the FPGA.
Add the missing tsu_clk for ptp on the macb on PolarFire SoC and resolve
a long-running problem with gpio interrupts being incorrectly described
on the platform.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-dt-for-v7.1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/conor/linux:
riscv: dts: microchip: update mpfs gpio interrupts to better match the SoC
riscv: dts: microchip: add tsu clock to macb on mpfs
dt-bindings: riscv: Add Supm extension description
riscv: dts: microchip: remove POLARFIRE mention in Makefile
riscv: dts: microchip: add pic64gx and its curiosity kit
dt-bindings: riscv: microchip: document the PIC64GX curiosity kit
dt-bindings: timer: sifive,clint: add pic64gx compatibility
riscv: dts: microchip: add pinctrl nodes for mpfs/icicle kit
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
|
|
into soc/dt
RISC-V SpacemiT DT changes for 7.1
For K3 SoC
- Add I2C support
- Add PMIC regulator tree
- Add ethernet support
- Add pinctrl/GPIO/Clock
- Enable full UART support
For K1 SoC
On Milk-V Jupiter
- Enable PCIe/USB on
- Enable QSPI/SPI NOR
- Enable EEPROM, LEDs
Others
- Fix PMIC supply properties
- Fix PCIe missing power regulator
* tag 'spacemit-dt-for-7.1-1' of https://github.com/spacemit-com/linux:
dts: riscv: spacemit: k3: add P1 PMIC regulator tree
dts: riscv: spacemit: k3: Add i2c nodes
riscv: dts: spacemit: enable PCIe ports on Milk-V Jupiter
riscv: dts: spacemit: enable USB 3 ports on Milk-V Jupiter
riscv: dts: spacemit: enable QSPI and add SPI NOR on Milk-V Jupiter
riscv: dts: spacemit: add i2c aliases on Milk-V Jupiter
riscv: dts: spacemit: add 24c04 eeprom on Milk-V Jupiter
riscv: dts: spacemit: add LEDs for Milk-V Jupiter board
riscv: dts: spacemit: Add ethernet device for K3
riscv: dts: spacemit: drop incorrect pinctrl for combo PHY
riscv: dts: spacemit: reorder phy nodes for K1
riscv: dts: spacemit: k3: add full resource to UART
riscv: dts: spacemit: k3: add GPIO support
riscv: dts: spacemit: k3: add pinctrl support
riscv: dts: spacemit: k3: add clock tree
dt-bindings: serial: 8250: spacemit: fix clock property for K3 SoC
riscv: dts: spacemit: Add 'linux,pci-domain' to PCIe nodes for K1
riscv: dts: spacemit: adapt regulator node name to preferred form
riscv: dts: spacemit: Update PMIC supply properties for BPI-F3 and Jupiter
riscv: dts: spacemit: pcie: fix missing power regulator
Signed-off-by: Linus Walleij <linusw@kernel.org>
|
|
https://github.com/Rust-for-Linux/linux into rust-next
Pull timekeeping updates from Andreas Hindborg:
- Expand the example section in the 'HrTimer' documentation.
- Mark the 'ClockSource' trait as unsafe to ensure valid values for
'ktime_get()'.
- Add 'Delta::from_nanos()'.
This is a back merge since the pull request has a newer base -- we will
avoid that in the future.
And, given it is a back merge, it happens to resolve the "subtle" conflict
around '--remap-path-{prefix,scope}' that I discussed in linux-next [1],
plus a few other common conflicts. The result matches what we did for
next-20260407.
The actual diffstat (i.e. using a temporary merge of upstream first) is:
rust/kernel/time.rs | 32 ++++-
rust/kernel/time/hrtimer.rs | 336 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 362 insertions(+), 6 deletions(-)
Link: https://lore.kernel.org/linux-next/CANiq72kdxB=W3_CV1U44oOK3SssztPo2wLDZt6LP94TEO+Kj4g@mail.gmail.com/ [1]
* tag 'rust-timekeeping-for-v7.1' of https://github.com/Rust-for-Linux/linux:
hrtimer: add usage examples to documentation
rust: time: make ClockSource unsafe trait
rust/time: Add Delta::from_nanos()
|
|
With the Rust version bump in place, several Kconfig conditions based on
`RUSTC_VERSION` are always true.
Thus simplify them.
The minimum supported major LLVM version by our new Rust minimum version
is now LLVM 18, instead of LLVM 16. However, there are no possible
cleanups for `RUSTC_LLVM_VERSION`.
Reviewed-by: Tamir Duberstein <tamird@kernel.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Link: https://patch.msgid.link/20260405235309.418950-9-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
|
|
Update acpi/pptt.c to use acpi_get_cpu_uid() and remove unused
get_acpi_id_for_cpu() from arm64/loongarch/riscv, completing PPTT's
migration to the unified ACPI CPU UID interface
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-8-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Centralize acpi_get_cpu_uid() in include/linux/acpi.h (global scope) and
remove arch-specific declarations from arm64/loongarch/riscv/x86
asm/acpi.h. This unifies the interface across architectures and
simplifies maintenance by eliminating duplicate prototypes.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-6-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
riscv. While at it, add input validation to make the code more robust.
And also update acpi_numa.c and rhct.c to use the new interface instead
of the legacy get_acpi_id_for_cpu().
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-4-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The make_xfence_request() function uses a shift operation to check if a
vCPU is in the hart mask:
if (!(hmask & (1UL << (vcpu->vcpu_id - hbase))))
However, when the difference between vcpu_id and hbase
is >= BITS_PER_LONG, the shift operation causes undefined behavior.
This was detected by UBSAN:
UBSAN: shift-out-of-bounds in arch/riscv/kvm/tlb.c:343:23
shift exponent 256 is too large for 64-bit type 'long unsigned int'
Fix this by adding a bounds check before the shift operation.
This bug was found by fuzzing the KVM RISC-V interface.
Fixes: 13acfec2dbcc ("RISC-V: KVM: Add remote HFENCE functions based on VCPU requests")
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260403232011.2394966-1-xujiakai2025@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
In order to be able to do this, we need to change VM_DATA_DEFAULT_FLAGS
and friends and update the architecture-specific definitions also.
We then have to update some KSM logic to handle VMA flags, and introduce
VMA_STACK_FLAGS to define the vma_flags_t equivalent of VM_STACK_FLAGS.
We also introduce two helper functions for use during the time we are
converting legacy flags to vma_flags_t values - vma_flags_to_legacy() and
legacy_to_vma_flags().
This enables us to iteratively make changes to break these changes up into
separate parts.
We use these explicitly here to keep VM_STACK_FLAGS around for certain
users which need to maintain the legacy vm_flags_t values for the time
being.
We are no longer able to rely on the simple VM_xxx being set to zero if
the feature is not enabled, so in the case of VM_DROPPABLE we introduce
VMA_DROPPABLE as the vma_flags_t equivalent, which is set to
EMPTY_VMA_FLAGS if the droppable flag is not available.
While we're here, we make the description of do_brk_flags() into a kdoc
comment, as it almost was already.
We use vma_flags_to_legacy() to not need to update the vm_get_page_prot()
logic as this time.
Note that in create_init_stack_vma() we have to replace the BUILD_BUG_ON()
with a VM_WARN_ON_ONCE() as the tested values are no longer build time
available.
We also update mprotect_fixup() to use VMA flags where possible, though we
have to live with a little duplication between vm_flags_t and vma_flags_t
values for the time being until further conversions are made.
While we're here, update VM_SPECIAL to be defined in terms of
VMA_SPECIAL_FLAGS now we have vma_flags_to_legacy().
Finally, we update the VMA tests to reflect these changes.
Link: https://lkml.kernel.org/r/d02e3e45d9a33d7904b149f5604904089fd640ae.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> [SELinux]
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The pudp_test_and_clear_young() is used to clear the young flag, returning
whether the young flag was set for this PUD entry. Change the return type
to bool to make the intention clearer.
Link: https://lkml.kernel.org/r/2c56fe52c1bf9404145274d7e91d4a65060f6c7c.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Callers use pmdp_test_and_clear_young() to clear the young flag and check
whether it was set for this PMD entry. Change the return type to bool to
make the intention clearer.
Link: https://lkml.kernel.org/r/f1d31307a13365d3d0fed5809727dcc2dd59631b.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The ptep_clear_flush_young() and clear_flush_young_ptes() are used to
clear the young flag and flush the TLB, returning whether the young flag
was set. Change the return type to bool to make the intention clearer.
Link: https://lkml.kernel.org/r/24af5144b96103631594501f77d4525f2475c1be.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "change young flag check functions to return bool", v2.
This is a cleanup patchset to change all young flag check functions to
return bool, as discussed with David in the previous thread[1]. Since
callers only care about whether the young flag was set, returning bool
makes the intention clearer. No functional changes intended.
This patch (of 6):
Callers use ptep_test_and_clear_young() to clear the young flag and check
whether it was set. Change the return type to bool to make the intention
clearer.
Link: https://lkml.kernel.org/r/cover.1774075004.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/57e70efa9703d43959aa645246ea3cbdba14fa17.1774075004.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION".
While working on memory hotplug code cleanups, I realized that
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE is not really required anymore.
Changing that revealed some rather nasty looking CONFIG_MIGRATION
handling.
Let's clean that up by introducing a dedicated CONFIG_NUMA_MIGRATION
option and reducing the dependencies that CONFIG_MIGRATION has.
This patch (of 2):
All architectures that select CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE also
select CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG. So we can just remove
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE.
For CONFIG_MIGRATION, make it depend on CONFIG_MEMORY_HOTREMOVE instead,
and make CONFIG_MEMORY_HOTREMOVE select CONFIG_MIGRATION (just like
CONFIG_CMA and CONFIG_COMPACTION already do).
We'll clean up CONFIG_MIGRATION next.
Link: https://lkml.kernel.org/r/20260319-config_migration-v1-0-42270124966f@kernel.org
Link: https://lkml.kernel.org/r/20260319-config_migration-v1-1-42270124966f@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The upcoming change to the HugeTLB vmemmap optimization (HVO) requires
struct pages of the head page to be naturally aligned with regard to the
folio size.
Align vmemmap to the newly introduced MAX_FOLIO_VMEMMAP_ALIGN.
Link: https://lkml.kernel.org/r/20260227194302.274384-6-kas@kernel.org
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: David Hildenbrand (arm) <david@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Replace part of the allocate_shadow_stack() content with a call to
vm_mmap_shadow_stack(). There is no functional change.
Link: https://lkml.kernel.org/r/20260225161404.3157851-4-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Deepak Gupta <debug@rivosinc.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of
ZERO_PAGE() to 4.
Every architecture defines empty_zero_page that way or another, but for the
most of them it is always a page aligned page in BSS and most definitions
of ZERO_PAGE do virt_to_page(empty_zero_page).
Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the
core MM and drop these definitions in architectures that do not implement
colored zero page (MIPS and s390).
ZERO_PAGE() remains a macro because turning it to a wrapper for a static
inline causes severe pain in header dependencies.
For the most part the change is mechanical, with these being noteworthy:
* alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot
parameters. Switching to a generic empty_zero_page removes the aliasing
and keeps ZERO_PGE for boot parameters only
* arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of
ZERO_PAGE() is kept intact.
* m68k/parisc/um: allocated empty_zero_page from memblock,
although they do not support zero page coloring and having it in BSS
will work fine.
* sparc64 can have empty_zero_page in BSS rather allocate it, but it
can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE()
but instead of allocating it, make mem_map_zero point to
empty_zero_page.
* sh: used empty_zero_page for boot parameters at the very early boot.
Rename the parameters page to boot_params_page and let sh use the generic
empty_zero_page.
* hexagon: had an amusing comment about empty_zero_page
/* A handy thing to have if one has the RAM. Declared in head.S */
that unfortunately had to go :)
Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Helge Deller <deller@gmx.de> [parisc]
Tested-by: Helge Deller <deller@gmx.de> [parisc]
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com> [alpha]
Acked-by: Dinh Nguyen <dinguyen@kernel.org> [nios2]
Acked-by: Andreas Larsson <andreas@gaisler.com> [sparc]
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Per Linus' comments requesting the replacement of "INDIR_BR_LP" in the
indirect branch tracking prctl()s with something more readable, and
suggesting the use of the speculation control prctl()s as an exemplar,
reimplement the prctl()s and related constants that control per-task
forward-edge control flow integrity.
This primarily involves two changes. First, the prctls are
restructured to resemble the style of the speculative execution
workaround control prctls PR_{GET,SET}_SPECULATION_CTRL, to make them
easier to extend in the future. Second, the "indir_br_lp" abbrevation
is expanded to "branch_landing_pads" to be less telegraphic. The
kselftest and documentation is adjusted accordingly.
Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Similar to the recent change to expand "LP" to "branch landing pad",
let's expand "SS" in the ptrace uapi macros to "shadow stack" as well.
This aligns with the existing prctl() arguments, which use the
expanded "shadow stack" names, rather than just the abbreviation.
Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Per Linus' comments about the unreadability of abbreviations such as
"indir_br_lp", rename the three prctl() implementation functions to be more
explicit. This involves renaming "indir_br_lp_status" in the function
names to "branch_landing_pad_state".
While here, add _prctl_ into the function names, following the
speculation control prctl implementation functions.
Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Per Linus' comments about the unreadability of abbreviations such as
"LP", rename the RISC-V ptrace landing pad CFI macro names to be more
explicit. This primarily involves expanding "LP" in the names to some
variant of "branch landing pad."
Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
When libc locks the CFI status through the following prctl:
- PR_LOCK_SHADOW_STACK_STATUS
- PR_LOCK_INDIR_BR_LP_STATUS
A newly execd address space will inherit the lock status
if it does not clear the lock bits. Since the lock bits
remain set, libc will later fail to enable the landing
pad and shadow stack.
Signed-off-by: Zong Li <zong.li@sifive.com>
Link: https://patch.msgid.link/20260323065640.4045713-1-zong.li@sifive.com
[pjw@kernel.org: ensure we unlock before changing state; cleaned up subject line]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
A CFI-related macro defined in arch/riscv/uapi/asm/ptrace.h misspells
"PTRACE" as "PRACE"; fix this.
Fixes: 2af7c9cf021c ("riscv/ptrace: expose riscv CFI status and state via ptrace and in core files")
Cc: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Fix the build of non-kernel code that includes the RISC-V ptrace uapi
header, and the RISC-V validate_v_ptrace.c kselftest, by using the
_BITUL() macro rather than BIT(). BIT() is not available outside
the kernel.
Based on patches and comments from Charlie Jenkins, Michael Neuling,
and Andreas Schwab.
Fixes: 30eb191c895b ("selftests: riscv: verify ptrace rejects invalid vector csr inputs")
Fixes: 2af7c9cf021c ("riscv/ptrace: expose riscv CFI status and state via ptrace and in core files")
Cc: Andreas Schwab <schwab@suse.de>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Charlie Jenkins <thecharlesjenkins@gmail.com>
Link: https://patch.msgid.link/20260330024248.449292-1-mikey@neuling.org
Link: https://lore.kernel.org/linux-riscv/20260309-fix_selftests-v2-1-9d5a553a531e@gmail.com/
Link: https://lore.kernel.org/linux-riscv/20260309-fix_selftests-v2-3-9d5a553a531e@gmail.com/
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
In set_tagged_addr_ctrl(), when PR_TAGGED_ADDR_ENABLE is not set, pmlen
is correctly set to 0, but it forgets to reset pmm. This results in the
CPU pmm state not corresponding to the software pmlen state.
Fix this by resetting pmm along with pmlen.
Fixes: 2e1743085887 ("riscv: Add support for the tagged address ABI")
Signed-off-by: Zishun Yi <vulab@iscas.ac.cn>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://patch.msgid.link/20260322160022.21908-1-vulab@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Similar as commit 284922f4c563 ("x86: uaccess: don't use runtime-const
rewriting in modules") does, make riscv's runtime const not usable by
modules too, to "make sure this doesn't get forgotten the next time
somebody wants to do runtime constant optimizations". The reason is
well explained in the above commit: "The runtime-const infrastructure
was never designed to handle the modular case, because the constant
fixup is only done at boot time for core kernel code."
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://patch.msgid.link/20260221023731.3476-1-jszhang@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Similarly to commit 8d09e2d569f6 ("arm64: patching: avoid early
page_to_phys()"), avoid using phys_to_page() for the kernel address case
in patch_map().
Since this is called from apply_boot_alternatives() in setup_arch(), and
commit 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE
memory model") has moved sparse_init() to after setup_arch(),
phys_to_page() is not available there yet, and it panics on boot with
SPARSEMEM on RV32, which does not use SPARSEMEM_VMEMMAP.
Reported-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Closes: https://lore.kernel.org/r/20260223144108-dcace0b9-02e8-4b67-a7ce-f263bed36f26@linutronix.de/
Fixes: 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE memory model")
Suggested-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20260310-riscv-sparsemem-alternatives-fix-v1-1-659d5dd257e2@iscas.ac.cn
[pjw@kernel.org: fix the subject line to align with the patch description]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Fix several bugs in the RISC-V kgdb implementation:
- The element of dbg_reg_def[] that is supposed to pertain to the S1
register embeds instead the struct pt_regs offset of the A1
register. Fix this to use the S1 register offset in struct pt_regs.
- The sleeping_thread_to_gdb_regs() function copies the value of the
S10 register into the gdb_regs[] array element meant for the S9
register, and copies the value of the S11 register into the array
element meant for the S10 register. It also neglects to copy the
value of the S11 register. Fix all of these issues.
Fixes: fe89bd2be8667 ("riscv: Add KGDB support")
Cc: Vincent Chen <vincent.chen@sifive.com>
Link: https://patch.msgid.link/fde376f8-bcfd-bfe4-e467-07d8f7608d05@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
|
Reuse KVM_CAP_VM_GPA_BITS to advertise and select the effective
G-stage GPA width for a VM.
KVM_CHECK_EXTENSION(KVM_CAP_VM_GPA_BITS) returns the effective GPA
bits for a VM, KVM_ENABLE_CAP(KVM_CAP_VM_GPA_BITS) allows userspace
to downsize the effective GPA width by selecting a smaller G-stage
page table format:
- gpa_bits <= 41 selects Sv39x4 (pgd_levels=3)
- gpa_bits <= 50 selects Sv48x4 (pgd_levels=4)
- gpa_bits <= 59 selects Sv57x4 (pgd_levels=5)
Reject the request with -EINVAL for unsupported values and with -EBUSY
if vCPUs have been created or any memslot is populated.
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20260403153019.9916-4-fangyu.yu@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Gstage page-table helpers frequently chase gstage->kvm->arch to
fetch pgd_levels. This adds noise and repeats the same dereference
chain in hot paths.
Add pgd_levels to struct kvm_gstage and initialize it from kvm->arch
when setting up a gstage instance. Introduce kvm_riscv_gstage_init()
to centralize initialization and switch gstage code to use
gstage->pgd_levels.
Suggested-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://lore.kernel.org/r/20260403153019.9916-3-fangyu.yu@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Introduces one per-VM architecture-specific fields to support runtime
configuration of the G-stage page table format:
- kvm->arch.pgd_levels: the corresponding number of page table levels
for the selected mode.
These fields replace the previous global variables
kvm_riscv_gstage_mode and kvm_riscv_gstage_pgd_levels, enabling different
virtual machines to independently select their G-stage page table format
instead of being forced to share the maximum mode detected by the kernel
at boot time.
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://lore.kernel.org/r/20260403153019.9916-2-fangyu.yu@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The hstateen0 will be programmed differently for guest HS-mode
and guest VS/VU-mode so don't check hstateen0.SSTATEEN0 bit when
updating sstateen0 CSR in kvm_riscv_vcpu_swap_in_guest_state()
and kvm_riscv_vcpu_swap_in_host_state().
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-10-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The VCPU config deals with hideleg, hedeleg, henvcfg, and hstateenX
CSR configuration for each VCPU. Factor-out VCPU config into separate
sources so that VCPU config can do things differently for guest HS-mode
and guest VS/VU-mode.
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-9-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The hideleg CSR state when VCPU is running in guest VS/VU-mode will
be different from when it is running in guest HS-mode. To achieve
this, add hideleg to struct kvm_vcpu_config and re-program hideleg
CSR upon every kvm_arch_vcpu_load().
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-8-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM_RISCV_TIMER_STATE_xyz defines specify possible values of the
"state" member in struct kvm_riscv_timer so move these defines closer
to struct kvm_riscv_timer in uapi/asm/kvm.h.
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-7-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
The KVM ISA extension related checks are not VCPU specific and
should be factored out of vcpu_onereg.c into separate sources.
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-6-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Rename kvm_riscv_vcpu_isa_check_host() to kvm_riscv_isa_check_host()
and use it as common function with KVM RISC-V to check isa extensions
supported by host.
Signed-off-by: Anup Patel <anup.patel@oss.qualcomm.com>
Reviewed-by: Radim Krčmář <radim.krcmar@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260120080013.2153519-5-anup.patel@oss.qualcomm.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|