| Age | Commit message (Collapse) | Author |
|
Pull bpf fixes from Alexei Starovoitov:
"Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI,
since all JITs had to be touched to move constant blinding out and
pass bpf_verifier_env in.
- Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov)
- Dissociate struct_ops program with map if map_update fails (Amery
Hung)
- Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann)
- Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel
Borkmann)
- Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns
(Eduard Zingerman)
- Copy token from main to subprogs to fix missing kallsyms (Eduard
Zingerman)
- Prevent double close and leak of btf objects in libbpf (Jiri Olsa)
- Fix af_unix null-ptr-deref in sockmap (Michal Luczaj)
- Fix NULL deref in map_kptr_match_type for scalar regs (Mykyta
Yatsenko)
- Avoid unnecessary IPIs. Remove redundant bpf_flush_icache() in
arm64 and riscv JITs (Puranjay Mohan)
- Fix out of bounds access. Validate node_id in arena_alloc_pages()
(Puranjay Mohan)
- Reject BPF-to-BPF calls and callbacks in arm32 JIT (Puranjay Mohan)
- Refactor all JITs to pass bpf_verifier_env to emit ENDBR/BTI for
indirect jump targets on x86-64, arm64 JITs (Xu Kuohai)
- Allow UTF-8 literals in bpf_bprintf_prepare() (Yihan Ding)"
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (32 commits)
bpf, arm32: Reject BPF-to-BPF calls and callbacks in the JIT
bpf: Dissociate struct_ops program with map if map_update fails
bpf: Validate node_id in arena_alloc_pages()
libbpf: Prevent double close and leak of btf objects
selftests/bpf: cover UTF-8 trace_printk output
bpf: allow UTF-8 literals in bpf_bprintf_prepare()
selftests/bpf: Reject scalar store into kptr slot
bpf: Fix NULL deref in map_kptr_match_type for scalar regs
bpf: Fix precedence bug in convert_bpf_ld_abs alignment check
bpf, arm64: Emit BTI for indirect jump target
bpf, x86: Emit ENDBR for indirect jump targets
bpf: Add helper to detect indirect jump targets
bpf: Pass bpf_verifier_env to JIT
bpf: Move constants blinding out of arch-specific JITs
bpf, sockmap: Take state lock for af_unix iter
bpf, sockmap: Fix af_unix null-ptr-deref in proto update
selftests/bpf: Extend bpf_iter_unix to attempt deadlocking
bpf, sockmap: Fix af_unix iter deadlock
bpf, sockmap: Annotate af_unix sock:: Sk_state data-races
selftests/bpf: verify kallsyms entries for token-loaded subprograms
...
|
|
Pull kvm updates from Paolo Bonzini:
"Arm:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis. This
uses the new infrastructure for 'remote' trace buffers that can be
exposed by non-kernel entities such as firmware, and which came
through the tracing tree
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM
- Finally add support for pKVM protected guests, where pages are
unmapped from the host as they are faulted into the guest and can
be shared back from the guest using pKVM hypercalls. Protected
guests are created using a new machine type identifier. As the
elusive guestmem has not yet delivered on its promises, anonymous
memory is also supported
This is only a first step towards full isolation from the host; for
example, the CPU register state and DMA accesses are not yet
isolated. Because this does not really yet bring fully what it
promises, it is hidden behind CONFIG_ARM_PKVM_GUEST +
'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is
created. Caveat emptor
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to the
various helpers and rendering a substantial amount of state
immutable
- Expand the Stage-2 page table dumper to support NV shadow page
tables on a per-VM basis
- Tidy up the pKVM PSCI proxy code to be slightly less hard to
follow
- Fix both SPE and TRBE in non-VHE configurations so that they do not
generate spurious, out of context table walks that ultimately lead
to very bad HW lockups
- A small set of patches fixing the Stage-2 MMU freeing in error
cases
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls
- The usual cleanups and other selftest churn
LoongArch:
- Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel()
- Add DMSINTC irqchip in kernel support
RISC-V:
- Fix steal time shared memory alignment checks
- Fix vector context allocation leak
- Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi()
- Fix double-free of sdata in kvm_pmu_clear_snapshot_area()
- Fix integer overflow in kvm_pmu_validate_counter_mask()
- Fix shift-out-of-bounds in make_xfence_request()
- Fix lost write protection on huge pages during dirty logging
- Split huge pages during fault handling for dirty logging
- Skip CSR restore if VCPU is reloaded on the same core
- Implement kvm_arch_has_default_irqchip() for KVM selftests
- Factored-out ISA checks into separate sources
- Added hideleg to struct kvm_vcpu_config
- Factored-out VCPU config into separate sources
- Support configuration of per-VM HGATP mode from KVM user space
s390:
- Support for ESA (31-bit) guests inside nested hypervisors
- Remove restriction on memslot alignment, which is not needed
anymore with the new gmap code
- Fix LPSW/E to update the bear (which of course is the breaking
event address register)
x86:
- Shut up various UBSAN warnings on reading module parameter before
they were initialized
- Don't zero-allocate page tables that are used for splitting
hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in
the page table and thus write all bytes
- As an optimization, bail early when trying to unsync 4KiB mappings
if the target gfn can just be mapped with a 2MiB hugepage
x86 generic:
- Copy single-chunk MMIO write values into struct kvm_vcpu (more
precisely struct kvm_mmio_fragment) to fix use-after-free stack
bugs where KVM would dereference stack pointer after an exit to
userspace
- Clean up and comment the emulated MMIO code to try to make it
easier to maintain (not necessarily "easy", but "easier")
- Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of
VMX and SVM enabling) as it is needed for trusted I/O
- Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions
- Immediately fail the build if a required #define is missing in one
of KVM's headers that is included multiple times
- Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected
exception, mostly to prevent syzkaller from abusing the uAPI to
trigger WARNs, but also because it can help prevent userspace from
unintentionally crashing the VM
- Exempt SMM from CPUID faulting on Intel, as per the spec
- Misc hardening and cleanup changes
x86 (AMD):
- Fix and optimize IRQ window inhibit handling for AVIC; make it
per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple
vCPUs have to-be-injected IRQs
- Clean up and optimize the OSVW handling, avoiding a bug in which
KVM would overwrite state when enabling virtualization on multiple
CPUs in parallel. This should not be a problem because OSVW should
usually be the same for all CPUs
- Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains
about a "too large" size based purely on user input
- Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION
- Disallow synchronizing a VMSA of an already-launched/encrypted
vCPU, as doing so for an SNP guest will crash the host due to an
RMP violation page fault
- Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped
queries are required to hold kvm->lock, and enforce it by lockdep.
Fix various bugs where sev_guest() was not ensured to be stable for
the whole duration of a function or ioctl
- Convert a pile of kvm->lock SEV code to guard()
- Play nicer with userspace that does not enable
KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6
as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the
payload would end up in EXITINFO2 rather than CR2, for example).
Only set CR2 and DR6 when consumption of the payload is imminent,
but on the other hand force delivery of the payload in all paths
where userspace retrieves CR2 or DR6
- Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT
instead of vmcb02->save.cr2. The value is out of sync after a
save/restore or after a #PF is injected into L2
- Fix a class of nSVM bugs where some fields written by the CPU are
not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so
are not up-to-date when saved by KVM_GET_NESTED_STATE
- Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE
and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly
initialized after save+restore
- Add a variety of missing nSVM consistency checks
- Fix several bugs where KVM failed to correctly update VMCB fields
on nested #VMEXIT
- Fix several bugs where KVM failed to correctly synthesize #UD or
#GP for SVM-related instructions
- Add support for save+restore of virtualized LBRs (on SVM)
- Refactor various helpers and macros to improve clarity and
(hopefully) make the code easier to maintain
- Aggressively sanitize fields when copying from vmcb12, to guard
against unintentionally allowing L1 to utilize yet-to-be-defined
features
- Fix several bugs where KVM botched rAX legality checks when
emulating SVM instructions. There are remaining issues in that KVM
doesn't handle size prefix overrides for 64-bit guests
- Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
instead of somewhat arbitrarily synthesizing #GP (i.e. don't double
down on AMD's architectural but sketchy behavior of generating #GP
for "unsupported" addresses)
- Cache all used vmcb12 fields to further harden against TOCTOU bugs
x86 (Intel):
- Drop obsolete branch hint prefixes from the VMX instruction macros
- Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a
register input when appropriate
- Code cleanups
guest_memfd:
- Don't mark guest_memfd folios as accessed, as guest_memfd doesn't
support reclaim, the memory is unevictable, and there is no storage
to write back to
LoongArch selftests:
- Add KVM PMU test cases
s390 selftests:
- Enable more memory selftests
x86 selftests:
- Add support for Hygon CPUs in KVM selftests
- Fix a bug in the MSR test where it would get false failures on
AMD/Hygon CPUs with exactly one of RDPID or RDTSCP
- Add an MADV_COLLAPSE testcase for guest_memfd as a regression test
for a bug where the kernel would attempt to collapse guest_memfd
folios against KVM's will"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits)
KVM: x86: use inlines instead of macros for is_sev_*guest
x86/virt: Treat SVM as unsupported when running as an SEV+ guest
KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails
KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper
KVM: SEV: use mutex guard in snp_handle_guest_req()
KVM: SEV: use mutex guard in sev_mem_enc_unregister_region()
KVM: SEV: use mutex guard in sev_mem_enc_ioctl()
KVM: SEV: use mutex guard in snp_launch_update()
KVM: SEV: Assert that kvm->lock is held when querying SEV+ support
KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe"
KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y
KVM: SEV: WARN on unhandled VM type when initializing VM
KVM: LoongArch: selftests: Add PMU overflow interrupt test
KVM: LoongArch: selftests: Add basic PMU event counting test
KVM: LoongArch: selftests: Add cpucfg read/write helpers
LoongArch: KVM: Add DMSINTC inject msi to vCPU
LoongArch: KVM: Add DMSINTC device support
LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function
LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch
LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "pid: make sub-init creation retryable" (Oleg Nesterov)
Make creation of init in a new namespace more robust by clearing away
some historical cruft which is no longer needed. Also some
documentation fixups
- "selftests/fchmodat2: Error handling and general" (Mark Brown)
Fix and a cleanup for the fchmodat2() syscall selftest
- "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko)
- "hung_task: Provide runtime reset interface for hung task detector"
(Aaron Tomlin)
Give administrators the ability to zero out
/proc/sys/kernel/hung_task_detect_count
- "tools/getdelays: use the static UAPI headers from
tools/include/uapi" (Thomas Weißschuh)
Teach getdelays to use the in-kernel UAPI headers rather than the
system-provided ones
- "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta)
Several cleanups and fixups to the hardlockup detector code and its
documentation
- "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law)
A couple of small/theoretical fixes in the bch code
- "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo)
- "cleanup the RAID5 XOR library" (Christoph Hellwig)
A quite far-reaching cleanup to this code. I can't do better than to
quote Christoph:
"The XOR library used for the RAID5 parity is a bit of a mess right
now. The main file sits in crypto/ despite not being cryptography
and not using the crypto API, with the generic implementations
sitting in include/asm-generic and the arch implementations
sitting in an asm/ header in theory. The latter doesn't work for
many cases, so architectures often build the code directly into
the core kernel, or create another module for the architecture
code.
Change this to a single module in lib/ that also contains the
architecture optimizations, similar to the library work Eric
Biggers has done for the CRC and crypto libraries later. After
that it changes to better calling conventions that allow for
smarter architecture implementations (although none is contained
here yet), and uses static_call to avoid indirection function call
overhead"
- "lib/list_sort: Clean up list_sort() scheduling workarounds"
(Kuan-Wei Chiu)
Clean up this library code by removing a hacky thing which was added
for UBIFS, which UBIFS doesn't actually need
- "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt)
Fix a few bugs in the scatterlist code, add in-kernel tests for the
now-fixed bugs and fix a leak in the test itself
- "kdump: Enable LUKS-encrypted dump target support in ARM64 and
PowerPC" (Coiby Xu)
Enable support of the LUKS-encrypted device dump target on arm64 and
powerpc
- "ocfs2: consolidate extent list validation into block read callbacks"
(Joseph Qi)
Cleanup, simplify, and make more robust ocfs2's validation of extent
list fields (Kernel test robot loves mounting corrupted fs images!)
* tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits)
ocfs2: validate group add input before caching
ocfs2: validate bg_bits during freefrag scan
ocfs2: fix listxattr handling when the buffer is full
doc: watchdog: fix typos etc
update Sean's email address
ocfs2: use get_random_u32() where appropriate
ocfs2: split transactions in dio completion to avoid credit exhaustion
ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path()
ocfs2: validate extent block list fields during block read
ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec()
ocfs2: validate dx_root extent list fields during block read
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
ocfs2: handle invalid dinode in ocfs2_group_extend
.get_maintainer.ignore: add Askar
ocfs2: validate bg_list extent bounds in discontig groups
checkpatch: exclude forward declarations of const structs
tools/accounting: handle truncated taskstats netlink messages
taskstats: set version in TGID exit notifications
ocfs2/heartbeat: fix slot mapping rollback leaks on error paths
arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel
...
|
|
Pass bpf_verifier_env to bpf_int_jit_compile(). The follow-up patch will
use env->insn_aux_data in the JIT stage to detect indirect jump targets.
Since bpf_prog_select_runtime() can be called by cbpf and lib/test_bpf.c
code without verifier, introduce helper __bpf_prog_select_runtime()
to accept the env parameter.
Remove the call to bpf_prog_select_runtime() in bpf_prog_load(), and
switch to call __bpf_prog_select_runtime() in the verifier, with env
variable passed. The original bpf_prog_select_runtime() is preserved for
cbpf and lib/test_bpf.c, where env is NULL.
Now all constants blinding calls are moved into the verifier, except
the cbpf and lib/test_bpf.c cases. The instructions arrays are adjusted
by bpf_patch_insn_data() function for normal cases, so there is no need
to call adjust_insn_arrays() in bpf_jit_blind_constants(). Remove it.
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> # v12
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # v14
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260416064341.151802-3-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
During the JIT stage, constants blinding rewrites instructions but only
rewrites the private instruction copy of the JITed subprog, leaving the
global env->prog->insnsi and env->insn_aux_data untouched. This causes a
mismatch between subprog instructions and the global state, making it
difficult to use the global data in the JIT.
To avoid this mismatch, and given that all arch-specific JITs already
support constants blinding, move it to the generic verifier code, and
switch to rewrite the global env->prog->insnsi with the global states
adjusted, as other rewrites in the verifier do.
This removes the constants blinding calls in each JIT, which are largely
duplicated code across architectures.
Since constants blinding is only required for JIT, and there are two
JIT entry functions, jit_subprogs() for BPF programs with multiple
subprogs and bpf_prog_select_runtime() for programs with no subprogs,
move the constants blinding invocation into these two functions.
In the verifier path, bpf_patch_insn_data() is used to keep global
verifier auxiliary data in sync with patched instructions. A key
question is whether this global auxiliary data should be restored
on the failure path.
Besides instructions, bpf_patch_insn_data() adjusts:
- prog->aux->poke_tab
- env->insn_array_maps
- env->subprog_info
- env->insn_aux_data
For prog->aux->poke_tab, it is only used by JIT or only meaningful after
JIT succeeds, so it does not need to be restored on the failure path.
For env->insn_array_maps, when JIT fails, programs using insn arrays
are rejected by bpf_insn_array_ready() due to missing JIT addresses.
Hence, env->insn_array_maps is only meaningful for JIT and does not need
to be restored.
For subprog_info, if jit_subprogs fails and CONFIG_BPF_JIT_ALWAYS_ON
is not enabled, kernel falls back to interpreter. In this case,
env->subprog_info is used to determine subprogram stack depth. So it
must be restored on failure.
For env->insn_aux_data, it is freed by clear_insn_aux_data() at the
end of bpf_check(). Before freeing, clear_insn_aux_data() loops over
env->insn_aux_data to release jump targets recorded in it. The loop
uses env->prog->len as the array length, but this length no longer
matches the actual size of the adjusted env->insn_aux_data array after
constants blinding.
To address it, a simple approach is to keep insn_aux_data as adjusted
after failure, since it will be freed shortly, and record its actual size
for the loop in clear_insn_aux_data(). But since clear_insn_aux_data()
uses the same index to loop over both env->prog->insnsi and env->insn_aux_data,
this approach results in incorrect index for the insnsi array. So an
alternative approach is adopted: clone the original env->insn_aux_data
before blinding and restore it after failure, similar to env->prog.
For classic BPF programs, constants blinding works as before since it
is still invoked from bpf_prog_select_runtime().
Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> # v8
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> # powerpc jit
Reviewed-by: Pu Lehui <pulehui@huawei.com> # riscv jit
Acked-by: Hengqi Chen <hengqi.chen@gmail.com> # loongarch jit
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20260416064341.151802-2-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "maple_tree: Replace big node with maple copy" (Liam Howlett)
Mainly prepararatory work for ongoing development but it does reduce
stack usage and is an improvement.
- "mm, swap: swap table phase III: remove swap_map" (Kairui Song)
Offers memory savings by removing the static swap_map. It also yields
some CPU savings and implements several cleanups.
- "mm: memfd_luo: preserve file seals" (Pratyush Yadav)
File seal preservation to LUO's memfd code
- "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
Chen)
Additional userspace stats reportng to zswap
- "arch, mm: consolidate empty_zero_page" (Mike Rapoport)
Some cleanups for our handling of ZERO_PAGE() and zero_pfn
- "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
Han)
A robustness improvement and some cleanups in the kmemleak code
- "Improve khugepaged scan logic" (Vernon Yang)
Improve khugepaged scan logic and reduce CPU consumption by
prioritizing scanning tasks that access memory frequently
- "Make KHO Stateless" (Jason Miu)
Simplify Kexec Handover by transitioning KHO from an xarray-based
metadata tracking system with serialization to a radix tree data
structure that can be passed directly to the next kernel
- "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
Ballasi and Steven Rostedt)
Enhance vmscan's tracepointing
- "mm: arch/shstk: Common shadow stack mapping helper and
VM_NOHUGEPAGE" (Catalin Marinas)
Cleanup for the shadow stack code: remove per-arch code in favour of
a generic implementation
- "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)
Fix a WARN() which can be emitted the KHO restores a vmalloc area
- "mm: Remove stray references to pagevec" (Tal Zussman)
Several cleanups, mainly udpating references to "struct pagevec",
which became folio_batch three years ago
- "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
Shutsemau)
Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
pages encode their relationship to the head page
- "mm/damon/core: improve DAMOS quota efficiency for core layer
filters" (SeongJae Park)
Improve two problematic behaviors of DAMOS that makes it less
efficient when core layer filters are used
- "mm/damon: strictly respect min_nr_regions" (SeongJae Park)
Improve DAMON usability by extending the treatment of the
min_nr_regions user-settable parameter
- "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)
The proper fix for a previously hotfixed SMP=n issue. Code
simplifications and cleanups ensued
- "mm: cleanups around unmapping / zapping" (David Hildenbrand)
A bunch of cleanups around unmapping and zapping. Mostly
simplifications, code movements, documentation and renaming of
zapping functions
- "support batched checking of the young flag for MGLRU" (Baolin Wang)
Batched checking of the young flag for MGLRU. It's part cleanups; one
benchmark shows large performance benefits for arm64
- "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)
memcg cleanup and robustness improvements
- "Allow order zero pages in page reporting" (Yuvraj Sakshith)
Enhance free page reporting - it is presently and undesirably order-0
pages when reporting free memory.
- "mm: vma flag tweaks" (Lorenzo Stoakes)
Cleanup work following from the recent conversion of the VMA flags to
a bitmap
- "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
Park)
Add some more developer-facing debug checks into DAMON core
- "mm/damon: test and document power-of-2 min_region_sz requirement"
(SeongJae Park)
An additional DAMON kunit test and makes some adjustments to the
addr_unit parameter handling
- "mm/damon/core: make passed_sample_intervals comparisons
overflow-safe" (SeongJae Park)
Fix a hard-to-hit time overflow issue in DAMON core
- "mm/damon: improve/fixup/update ratio calculation, test and
documentation" (SeongJae Park)
A batch of misc/minor improvements and fixups for DAMON
- "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
Hildenbrand)
Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
movement was required.
- "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)
A somewhat random mix of fixups, recompression cleanups and
improvements in the zram code
- "mm/damon: support multiple goal-based quota tuning algorithms"
(SeongJae Park)
Extend DAMOS quotas goal auto-tuning to support multiple tuning
algorithms that users can select
- "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)
Fix the khugpaged sysfs handling so we no longer spam the logs with
reams of junk when starting/stopping khugepaged
- "mm: improve map count checks" (Lorenzo Stoakes)
Provide some cleanups and slight fixes in the mremap, mmap and vma
code
- "mm/damon: support addr_unit on default monitoring targets for
modules" (SeongJae Park)
Extend the use of DAMON core's addr_unit tunable
- "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)
Cleanups to khugepaged and is a base for Nico's planned khugepaged
mTHP support
- "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)
Code movement and cleanups in the memhotplug and sparsemem code
- "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION" (David Hildenbrand)
Rationalize some memhotplug Kconfig support
- "change young flag check functions to return bool" (Baolin Wang)
Cleanups to change all young flag check functions to return bool
- "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
Law and SeongJae Park)
Fix a few potential DAMON bugs
- "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
Stoakes)
Convert a lot of the existing use of the legacy vm_flags_t data type
to the new vma_flags_t type which replaces it. Mainly in the vma
code.
- "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)
Expand the mmap_prepare functionality, which is intended to replace
the deprecated f_op->mmap hook which has been the source of bugs and
security issues for some time. Cleanups, documentation, extension of
mmap_prepare into filesystem drivers
- "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)
Simplify and clean up zap_huge_pmd(). Additional cleanups around
vm_normal_folio_pmd() and the softleaf functionality are performed.
* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm: fix deferred split queue races during migration
mm/khugepaged: fix issue with tracking lock
mm/huge_memory: add and use has_deposited_pgtable()
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
mm/huge_memory: separate out the folio part of zap_huge_pmd()
mm/huge_memory: use mm instead of tlb->mm
mm/huge_memory: remove unnecessary sanity checks
mm/huge_memory: deduplicate zap deposited table call
mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
mm/huge_memory: add a common exit path to zap_huge_pmd()
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
mm/huge: avoid big else branch in zap_huge_pmd()
mm/huge_memory: simplify vma_is_specal_huge()
mm: on remap assert that input range within the proposed VMA
mm: add mmap_action_map_kernel_pages[_full]()
uio: replace deprecated mmap hook with mmap_prepare in uio_info
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
mm: allow handling of stacked mmap_prepare hooks in more drivers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support HW queue leasing, allowing containers to be granted access
to HW queues for zero-copy operations and AF_XDP
- Number of code moves to help the compiler with inlining. Avoid
output arguments for returning drop reason where possible
- Rework drop handling within qdiscs to include more metadata about
the reason and dropping qdisc in the tracepoints
- Remove the rtnl_lock use from IP Multicast Routing
- Pack size information into the Rx Flow Steering table pointer
itself. This allows making the table itself a flat array of u32s,
thus making the table allocation size a power of two
- Report TCP delayed ack timer information via socket diag
- Add ip_local_port_step_width sysctl to allow distributing the
randomly selected ports more evenly throughout the allowed space
- Add support for per-route tunsrc in IPv6 segment routing
- Start work of switching sockopt handling to iov_iter
- Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
buffer size drifting up
- Support MSG_EOR in MPTCP
- Add stp_mode attribute to the bridge driver for STP mode selection.
This addresses concerns about call_usermodehelper() usage
- Remove UDP-Lite support (as announced in 2023)
- Remove support for building IPv6 as a module. Remove the now
unnecessary function calling indirection
Cross-tree stuff:
- Move Michael MIC code from generic crypto into wireless, it's
considered insecure but some WiFi networks still need it
Netfilter:
- Switch nft_fib_ipv6 module to no longer need temporary dst_entry
object allocations by using fib6_lookup() + RCU.
Florian W reports this gets us ~13% higher packet rate
- Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
switch the service tables to be per-net. Convert some code that
walks the service lists to use RCU instead of the service_mutex
- Add more opinionated input validation to lower security exposure
- Make IPVS hash tables to be per-netns and resizable
Wireless:
- Finished assoc frame encryption/EPPKE/802.1X-over-auth
- Radar detection improvements
- Add 6 GHz incumbent signal detection APIs
- Multi-link support for FILS, probe response templates and client
probing
- New APIs and mac80211 support for NAN (Neighbor Aware Networking,
aka Wi-Fi Aware) so less work must be in firmware
Driver API:
- Add numerical ID for devlink instances (to avoid having to create
fake bus/device pairs just to have an ID). Support shared devlink
instances which span multiple PFs
- Add standard counters for reporting pause storm events (implement
in mlx5 and fbnic)
- Add configuration API for completion writeback buffering (implement
in mana)
- Support driver-initiated change of RSS context sizes
- Support DPLL monitoring input frequency (implement in zl3073x)
- Support per-port resources in devlink (implement in mlx5)
Misc:
- Expand the YAML spec for Netfilter
Drivers
- Software:
- macvlan: support multicast rx for bridge ports with shared
source MAC address
- team: decouple receive and transmit enablement for IEEE 802.3ad
LACP "independent control"
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- support high order pages in zero-copy mode (for payload
coalescing)
- support multiple packets in a page (for systems with 64kB
pages)
- Broadcom 25-400GE (bnxt):
- implement XDP RSS hash metadata extraction
- add software fallback for UDP GSO, lowering the IOMMU cost
- Broadcom 800GE (bnge):
- add link status and configuration handling
- add various HW and SW statistics
- Marvell/Cavium:
- NPC HW block support for cn20k
- Huawei (hinic3):
- add mailbox / control queue
- add rx VLAN offload
- add driver info and link management
- Ethernet NICs:
- Marvell/Aquantia:
- support reading SFP module info on some AQC100 cards
- Realtek PCI (r8169):
- add support for RTL8125cp
- Realtek USB (r8152):
- support for the RTL8157 5Gbit chip
- add 2500baseT EEE status/configuration support
- Ethernet NICs embedded and off-the-shelf IP:
- Synopsys (stmmac):
- cleanup and reorganize SerDes handling and PCS support
- cleanup descriptor handling and per-platform data
- cleanup and consolidate MDIO defines and handling
- shrink driver memory use for internal structures
- improve Tx IRQ coalescing
- improve TCP segmentation handling
- add support for Spacemit K3
- Cadence (macb):
- support PHYs that have inband autoneg disabled with GEM
- support IEEE 802.3az EEE
- rework usrio capabilities and handling
- AMD (xgbe):
- improve power management for S0i3
- improve TX resilience for link-down handling
- Virtual:
- Google cloud vNIC:
- support larger ring sizes in DQO-QPL mode
- improve HW-GRO handling
- support UDP GSO for DQO format
- PCIe NTB:
- support queue count configuration
- Ethernet PHYs:
- automatically disable PHY autonomous EEE if MAC is in charge
- Broadcom:
- add BCM84891/BCM84892 support
- Micrel:
- support for LAN9645X internal PHY
- Realtek:
- add RTL8224 pair order support
- support PHY LEDs on RTL8211F-VD
- support spread spectrum clocking (SSC)
- Maxlinear:
- add PHY-level statistics via ethtool
- Ethernet switches:
- Maxlinear (mxl862xx):
- support for bridge offloading
- support for VLANs
- support driver statistics
- Bluetooth:
- large number of fixes and new device IDs
- Mediatek:
- support MT6639 (MT7927)
- support MT7902 SDIO
- WiFi:
- Intel (iwlwifi):
- UNII-9 and continuing UHR work
- MediaTek (mt76):
- mt7996/mt7925 MLO fixes/improvements
- mt7996 NPU support (HW eth/wifi traffic offload)
- Qualcomm (ath12k):
- monitor mode support on IPQ5332
- basic hwmon temperature reporting
- support IPQ5424
- Realtek:
- add USB RX aggregation to improve performance
- add USB TX flow control by tracking in-flight URBs
- Cellular:
- IPA v5.2 support"
* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
wireguard: allowedips: remove redundant space
tools: ynl: add sample for wireguard
wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
MAINTAINERS: Add netkit selftest files
selftests/net: Add additional test coverage in nk_qlease
selftests/net: Split netdevsim tests from HW tests in nk_qlease
tools/ynl: Make YnlFamily closeable as a context manager
net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
net: airoha: Fix VIP configuration for AN7583 SoC
net: caif: clear client service pointer on teardown
net: strparser: fix skb_head leak in strp_abort_strp()
net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
selftests/bpf: add test for xdp_master_redirect with bond not up
net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
sctp: disable BH before calling udp_tunnel_xmit_skb()
sctp: fix missing encap_port propagation for GSO fragments
net: airoha: Rely on net_device pointer in ETS callbacks
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vdso updates from Thomas Gleixner:
- Make the handling of compat functions consistent and more robust
- Rework the underlying data store so that it is dynamically allocated,
which allows the conversion of the last holdout SPARC64 to the
generic VDSO implementation
- Rework the SPARC64 VDSO to utilize the generic implementation
- Mop up the left overs of the non-generic VDSO support in the core
code
- Expand the VDSO selftest and make them more robust
- Allow time namespaces to be enabled independently of the generic VDSO
support, which was not possible before due to SPARC64 not using it
- Various cleanups and improvements in the related code
* tag 'timers-vdso-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
timens: Use task_lock guard in timens_get*()
timens: Use mutex guard in proc_timens_set_offset()
timens: Simplify some calls to put_time_ns()
timens: Add a __free() wrapper for put_time_ns()
timens: Remove dependency on the vDSO
vdso/timens: Move functions to new file
selftests: vDSO: vdso_test_correctness: Add a test for time()
selftests: vDSO: vdso_test_correctness: Use facilities from parse_vdso.c
selftests: vDSO: vdso_test_correctness: Handle different tv_usec types
selftests: vDSO: vdso_test_correctness: Drop SYS_getcpu fallbacks
selftests: vDSO: vdso_test_gettimeofday: Remove nolibc checks
Revert "selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers"
random: vDSO: Remove ifdeffery
random: vDSO: Trim vDSO includes
vdso/datapage: Trim down unnecessary includes
vdso/datapage: Remove inclusion of gettimeofday.h
vdso/helpers: Explicitly include vdso/processor.h
vdso/gettimeofday: Add explicit includes
random: vDSO: Add explicit includes
MIPS: vdso: Explicitly include asm/vdso/vdso.h
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt chip driver updates from Thomas Gleixner:
- A large refactoring for the Renesas RZV2H driver to add new interrupt
types cleanly
- A large refactoring for the Renesas RZG2L driver to add support the
new RZ/G3L variant
- Add support for the new NXP S32N79 chip in the IMX irq-steer driver
- Add support for the Apple AICv3 variant
- Enhance the Loongson PCH LPC driver so it can be used on MIPS with
device tree firmware
- Allow the PIC32 EVIC driver to be built independent of MIPS in
compile tests
- The usual small fixes and enhancements all over the place
* tag 'irq-drivers-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
irqchip/irq-pic32-evic: Add __maybe_unused for board_bind_eic_interrupt in COMPILE_TEST
irqchip/renesas-rzv2h: Kill icu_err string
irqchip/renesas-rzv2h: Kill swint_names[]
irqchip/renesas-rzv2h: Kill swint_idx[]
irqchip/renesas-rzg2l: Add NMI support
irqchip/renesas-rzg2l: Clear the shared interrupt bit in rzg2l_irqc_free()
irqchip/renesas-rzg2l: Replace raw_spin_{lock,unlock} with guard() in rzg2l_irq_set_type()
irqchip/gic-v3: Print a warning for out-of-range interrupt numbers
irqchip/renesas-rzg2l: Add shared interrupt support
irqchip/renesas-rzg2l: Add RZ/G3L support
irqchip/renesas-rzg2l: Drop IRQC_IRQ_COUNT macro
irqchip/renesas-rzg2l: Drop IRQC_TINT_START macro
irqchip/renesas-rzg2l: Drop IRQC_NUM_IRQ macro
irqchip/renesas-rzg2l: Dynamically allocate fwspec array
irqchip/renesas-rzg2l: Split rzfive_irqc_{mask,unmask} into separate IRQ and TINT handlers
irqchip/renesas-rzg2l: Split rzfive_tint_irq_endisable() into separate IRQ and TINT helpers
irqchip/renesas-rzg2l: Replace rzg2l_irqc_irq_{enable,disable} with TINT-specific handlers
irqchip/renesas-rzg2l: Split set_type handler into separate IRQ and TINT functions
irqchip/renesas-rzg2l: Split EOI handler into separate IRQ and TINT functions
irqchip/renesas-rzg2l: Replace single irq_chip with per-region irq_chip instances
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core irq updates from Thomas Gleixner:
- Invoke add_interrupt_randomness() in handle_percpu_devid_irq() and
cleanup the workaround in the Hyper-V driver, which would now invoke
it twice on ARM64. Removing it from the driver requires to add it to
the x86 system vector entry point
- Remove the pointles cpu_read_lock() around reading CPU possible mask,
which is read only after init
- Add documentation for the interaction between device tree bindings
and the interrupt type defines in irq.h
- Delete stale defines in the matrix allocator and the equivalent in
loongarch
* tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Drivers: hv: Move add_interrupt_randomness() to hypervisor callback sysvec
genirq/chip: Invoke add_interrupt_randomness() in handle_percpu_devid_irq()
genirq/affinity: Remove cpus_read_lock() while reading cpu_possible_mask
genirq/matrix, LoongArch: Delete IRQ_MATRIX_BITS leftovers
genirq: Document interaction between <linux/irq.h> and DT binding defines
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux
Pull Kbuild/Kconfig updates from Nicolas Schier:
"Kbuild:
- reject unexpected values for LLVM=
- uapi: remove usage of toolchain headers
- switch from '-fms-extensions' to '-fms-anonymous-structs' when
available (currently: clang >= 23.0.0)
- reduce the number of compiler-generated suffixes for clang thin-lto
build
- reduce output spam ("GEN Makefile") when building out of tree
- improve portability for testing headers
- also test UAPI headers against C++ compilers
- drop build ID architecture allow-list in vdso_install
- only run checksyscalls when necessary
- update the debug information notes in reproducible-builds.rst
- expand inlining hints with -fdiagnostics-show-inlining-chain
Kconfig:
- forbid multiple entries with the same symbol in a choice
- error out on duplicated kconfig inclusion"
* tag 'kbuild-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (35 commits)
kbuild: expand inlining hints with -fdiagnostics-show-inlining-chain
kconfig: forbid multiple entries with the same symbol in a choice
Documentation: kbuild: Update the debug information notes in reproducible-builds.rst
checksyscalls: move instance functionality into generic code
checksyscalls: only run when necessary
checksyscalls: fail on all intermediate errors
checksyscalls: move path to reference table to a variable
kbuild: vdso_install: drop build ID architecture allow-list
kbuild: vdso_install: gracefully handle images without build ID
kbuild: vdso_install: hide readelf warnings
kbuild: vdso_install: split out the readelf invocation
kbuild: uapi: also test UAPI headers against C++ compilers
kbuild: uapi: provide a C++ compatible dummy definition of NULL
kbuild: uapi: handle UML in architecture-specific exclusion lists
kbuild: uapi: move all include path flags together
kbuild: uapi: move some compiler arguments out of the command definition
check-uapi: use dummy libc includes
check-uapi: honor ${CROSS_COMPILE} setting
check-uapi: link into shared objects
kbuild: reduce output spam when building out of tree
...
|
|
Pull bitmap updates from Yury Norov:
- new API: bitmap_weight_from() and bitmap_weighted_xor() (Yury)
- drop unused __find_nth_andnot_bit() (Yury)
- new tests and test improvements (Andy, Akinobu, Yury)
- fixes for count_zeroes API (Yury)
- cleanup bitmap_print_to_pagebuf() mess (Yury)
- documentation updates (Andy, Kai, Kit).
* tag 'bitmap-for-v7.1' of https://github.com/norov/linux: (24 commits)
bitops: Update kernel-doc for sign_extendXX()
powerpc/xive: simplify xive_spapr_debug_show()
thermal: intel: switch cpumask_get() to using cpumask_print_to_pagebuf()
coresight: don't use bitmap_print_to_pagebuf()
lib/prime_numbers: drop temporary buffer in dump_primes()
drm/xe: switch xe_pagefault_queue_init() to using bitmap_weighted_or()
ice: use bitmap_empty() in ice_vf_has_no_qs_ena
ice: use bitmap_weighted_xor() in ice_find_free_recp_res_idx()
bitmap: introduce bitmap_weighted_xor()
bitmap: add test_zero_nbits()
bitmap: exclude nbits == 0 cases from bitmap test
bitmap: test bitmap_weight() for more
asm-generic/bitops: Fix a comment typo in instrumented-atomic.h
bitops: fix kernel-doc parameter name for parity8()
lib: count_zeros: unify count_{leading,trailing}_zeros()
lib: count_zeros: fix 32/64-bit inconsistency in count_trailing_zeros()
lib: crypto: fix comments for count_leading_zeros()
x86/topology: use bitmap_weight_from()
bitmap: add bitmap_weight_from()
lib/find_bit_benchmark: avoid clearing randomly filled bitmap in test_find_first_bit()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI support updates from Rafael Wysocki:
"These include an update of the CMOS RTC driver and the related ACPI
and x86 code that, among other things, switches it over to using the
platform device interface for device binding on x86 instead of the PNP
device driver interface (which allows the code in question to be
simplified quite a bit), a major update of the ACPI Time and Alarm
Device (TAD) driver adding an RTC class device interface to it, and
updates of core ACPI drivers that remove some unnecessary and not
really useful code from them.
Apart from that, two drivers are converted to using the platform
driver interface for device binding instead of the ACPI driver one,
which is slated for removal, support for the Performance Limited
register is added to the ACPI CPPC library and there are some
janitorial updates of it and the related cpufreq CPPC driver, the ACPI
processor driver is fixed and cleaned up, and NVIDIA vendor CPER
record handler is added to the APEI GHES code.
Also, the interface for obtaining a CPU UID from ACPI is consolidated
across architectures and used for fixing a problem with the PCI TPH
Steering Tag on ARM64, there are two updates related to ACPICA, a
minor ACPI OS Services Layer (OSL) update, and a few assorted updates
related to ACPI tables parsing.
Specifics:
- Update maintainers information regarding ACPICA (Rafael Wysocki)
- Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy()
(Kees Cook)
- Trigger an ordered system power off after encountering a fatal
error operator in AML (Armin Wolf)
- Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao)
- Remove the temporary stop-gap acpi_pptt_cache_v1_full structure
from the ACPI PPTT parser (Ben Horgan)
- Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate
DeSimone)
- Address multiple assorted issues and clean up the code in the ACPI
processor idle driver (Huisong Li)
- Replace strlcat() in the ACPI processor idle drive with a better
alternative (Andy Shevchenko)
- Rearrange and clean up acpi_processor_errata_piix4() (Rafael
Wysocki)
- Move reference performance to capabilities and fix an uninitialized
variable in the ACPI CPPC library (Pengjie Zhang)
- Add support for the Performance Limited Register to the ACPI CPPC
library (Sumit Gupta)
- Add cppc_get_perf() API to read performance controls, extend
cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC
library warn on missing mandatory DESIRED_PERF register (Sumit
Gupta)
- Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in
target callbacks to allow it to control performance bounds via
standard scaling_min_freq and scaling_max_freq sysfs attributes and
add sysfs documentation for the Performance Limited Register to it
(Sumit Gupta)
- Add ACPI support to the platform device interface in the CMOS RTC
driver, make the ACPI core device enumeration code create a
platform device for the CMOS RTC, and drop CMOS RTC PNP device
support (Rafael Wysocki)
- Consolidate the x86-specific CMOS RTC handling with the ACPI TAD
driver and clean up the CMOS RTC ACPI address space handler (Rafael
Wysocki)
- Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT
and allow that driver to work without a dedicated IRQ if the ACPI
alarm is used (Rafael Wysocki)
- Clean up the ACPI TAD driver in various ways and add an RTC class
device interface, including both the RTC setting/reading and alarm
timer support, to it (Rafael Wysocki)
- Clean up the ACPI AC and ACPI PAD (processor aggregator device)
drivers (Rafael Wysocki)
- Rework checking for duplicate video bus devices and consolidate
pnp.bus_id workarounds handling in the ACPI video bus driver
(Rafael Wysocki)
- Update the ACPI core device drivers to stop setting
acpi_device_name() unnecessarily (Rafael Wysocki)
- Rearrange code using acpi_device_class() in the ACPI core device
drivers and update them to stop setting acpi_device_class()
unnecessarily (Rafael Wysocki)
- Define ACPI_AC_CLASS in one place (Rafael Wysocki)
- Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver
to bind to platform devices instead of ACPI devices (Rafael
Wysocki)
- Add devm_ghes_register_vendor_record_notifier(), use it in the PCI
hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng
Feng)
- Consolidate the interface for obtaining a CPU UID from ACPI across
architectures and use it to address incorrect PCI TPH Steering Tag
on ARM64 resulting from the invalid assumption that the ACPI
Processor UID would always be the same as the corresponding logical
CPU ID in Linux (Chengwen Feng)"
* tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
ACPICA: Update maintainers information
watchdog: ni903x_wdt: Convert to a platform driver
ACPI: PAD: xen: Convert to a platform driver
ACPI: processor: idle: Reset cpuidle on C-state list changes
cpuidle: Extract and export no-lock variants of cpuidle_unregister_device()
PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM
ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu()
perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu()
ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h
x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval
ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler
PCI: hisi: Use devm_ghes_register_vendor_record_notifier()
ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()
ACPI: tables: Enable FPDT on LoongArch
ACPI: processor: idle: Fix NULL pointer dereference in hotplug path
ACPI: processor: idle: Reset power_setup_done flag on initialization failure
ACPI: TAD: Add alarm support to the RTC class device interface
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- randomize_kstack: Improve implementation across arches (Ryan Roberts)
- lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
- refcount: Remove unused __signed_wrap function annotations
* tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test
refcount: Remove unused __signed_wrap function annotations
randomize_kstack: Unify random source across arches
randomize_kstack: Maintain kstack_offset per task
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Migrate more hash algorithms from the traditional crypto subsystem to
lib/crypto/
Like the algorithms migrated earlier (e.g. SHA-*), this simplifies
the implementations, improves performance, enables further
simplifications in calling code, and solves various other issues:
- AES CBC-based MACs (AES-CMAC, AES-XCBC-MAC, and AES-CBC-MAC)
- Support these algorithms in lib/crypto/ using the AES library
and the existing arm64 assembly code
- Reimplement the traditional crypto API's "cmac(aes)",
"xcbc(aes)", and "cbcmac(aes)" on top of the library
- Convert mac80211 to use the AES-CMAC library. Note: several
other subsystems can use it too and will be converted later
- Drop the broken, nonstandard, and likely unused support for
"xcbc(aes)" with key lengths other than 128 bits
- Enable optimizations by default
- GHASH
- Migrate the standalone GHASH code into lib/crypto/
- Integrate the GHASH code more closely with the very similar
POLYVAL code, and improve the generic GHASH implementation to
resist cache-timing attacks and use much less memory
- Reimplement the AES-GCM library and the "gcm" crypto_aead
template on top of the GHASH library. Remove "ghash" from the
crypto_shash API, as it's no longer needed
- Enable optimizations by default
- SM3
- Migrate the kernel's existing SM3 code into lib/crypto/, and
reimplement the traditional crypto API's "sm3" on top of it
- I don't recommend using SM3, but this cleanup is worthwhile
to organize the code the same way as other algorithms
- Testing improvements:
- Add a KUnit test suite for each of the new library APIs
- Migrate the existing ChaCha20Poly1305 test to KUnit
- Make the KUnit all_tests.config enable all crypto library tests
- Move the test kconfig options to the Runtime Testing menu
- Other updates to arch-optimized crypto code:
- Optimize SHA-256 for Zhaoxin CPUs using the Padlock Hash Engine
- Remove some MD5 implementations that are no longer worth keeping
- Drop big endian and voluntary preemption support from the arm64
code, as those configurations are no longer supported on arm64
- Make jitterentropy and samples/tsm-mr use the crypto library APIs
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (66 commits)
lib/crypto: arm64: Assume a little-endian kernel
arm64: fpsimd: Remove obsolete cond_yield macro
lib/crypto: arm64/sha3: Remove obsolete chunking logic
lib/crypto: arm64/sha512: Remove obsolete chunking logic
lib/crypto: arm64/sha256: Remove obsolete chunking logic
lib/crypto: arm64/sha1: Remove obsolete chunking logic
lib/crypto: arm64/poly1305: Remove obsolete chunking logic
lib/crypto: arm64/gf128hash: Remove obsolete chunking logic
lib/crypto: arm64/chacha: Remove obsolete chunking logic
lib/crypto: arm64/aes: Remove obsolete chunking logic
lib/crypto: Include <crypto/utils.h> instead of <crypto/algapi.h>
lib/crypto: aesgcm: Don't disable IRQs during AES block encryption
lib/crypto: aescfb: Don't disable IRQs during AES block encryption
lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit
lib/crypto: sparc: Drop optimized MD5 code
lib/crypto: mips: Drop optimized MD5 code
lib: Move crypto library tests to Runtime Testing menu
crypto: sm3 - Remove 'struct sm3_state'
crypto: sm3 - Remove the original "sm3_block_generic()"
crypto: sm3 - Remove sm3_base.h
...
|
|
Implement irqfd that deliver msi to vCPU and vCPU dmsintc irq injection.
Add pch_msi_set_irq() choice dmsintc to set msi irq by the msg_addr and
implement dmsintc set msi irq.
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add device model for DMSINTC interrupt controller, implement basic
create/destroy/set_attr interfaces, and register device model to kvm
device table.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
vcpu_is_preempted() is performance sensitive that called in function
osq_lock(), here make it as a macro. So that parameter is not parsed
at most time, it can avoid cache line thrashing across numa nodes.
Here is part of UnixBench result on Loongson-3C5000 DualWay machine with
32 cores and 2 numa nodes.
original inline macro
execl 7025.7 6991.2 7242.3
fstime 474.6 703.1 1071
From the test result, making vcpu_is_preempted() as a macro is the best,
and there is some improvment compared with the original function method.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
CSR register LOONGARCH_CSR_GSTAT stores guest VMID information. With
existing implementation method, VMID is per vCPU, similar with ASID in
kernel. LOONGARCH_CSR_GSTAT is written at VM entry even if VMID is not
changed.
Here move LOONGARCH_CSR_GSTAT save/restore in vCPU context switch, and
update LOONGARCH_CSR_GSTAT only when VMID is updated at VM entry. At
most time VM enter/exit is much more frequent than vCPU thread context
switch.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
CSR register LOONGARCH_CSR_EENTRY is shared between host CPU and guest
vCPU, KVM need save and restore LOONGARCH_CSR_EENTRY register. Here move
LOONGARCH_CSR_EENTRY saving in to context switch function rather than VM
entry.
At most time VM enter/exit is much more frequent than vCPU thread context
switch.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Add kvm_request_pending() checking firstly in kvm_late_check_requests(),
at most time there is no pending request, then the following pending bit
checking can be skipped.
Also embed function kvm_check_pmu() in to kvm_late_check_requests(), and
put it after the kvm_request_pending() checking.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The function reads LOONGARCH_CSR_CRMD but uses CSR_PRMD_PPLV to
extract the privilege level. While both masks have the same value
(0x3), CSR_CRMD_PLV is the semantically correct constant for CRMD.
Cc: stable@vger.kernel.org
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The entire git history for this driver looks like tree-wide
and automated cleanups. There's even more coming now with
AI, so let's try to delete it instead.
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260403220501.2263835-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update acpi/pptt.c to use acpi_get_cpu_uid() and remove unused
get_acpi_id_for_cpu() from arm64/loongarch/riscv, completing PPTT's
migration to the unified ACPI CPU UID interface
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-8-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Centralize acpi_get_cpu_uid() in include/linux/acpi.h (global scope) and
remove arch-specific declarations from arm64/loongarch/riscv/x86
asm/acpi.h. This unifies the interface across architectures and
simplifies maintenance by eliminating duplicate prototypes.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-6-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
As a step towards unifying the interface for retrieving ACPI CPU UID
across architectures, introduce a new function acpi_get_cpu_uid() for
loongarch. While at it, add input validation to make the code more
robust.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260401081640.26875-3-fengchengwen@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
In order to be able to do this, we need to change VM_DATA_DEFAULT_FLAGS
and friends and update the architecture-specific definitions also.
We then have to update some KSM logic to handle VMA flags, and introduce
VMA_STACK_FLAGS to define the vma_flags_t equivalent of VM_STACK_FLAGS.
We also introduce two helper functions for use during the time we are
converting legacy flags to vma_flags_t values - vma_flags_to_legacy() and
legacy_to_vma_flags().
This enables us to iteratively make changes to break these changes up into
separate parts.
We use these explicitly here to keep VM_STACK_FLAGS around for certain
users which need to maintain the legacy vm_flags_t values for the time
being.
We are no longer able to rely on the simple VM_xxx being set to zero if
the feature is not enabled, so in the case of VM_DROPPABLE we introduce
VMA_DROPPABLE as the vma_flags_t equivalent, which is set to
EMPTY_VMA_FLAGS if the droppable flag is not available.
While we're here, we make the description of do_brk_flags() into a kdoc
comment, as it almost was already.
We use vma_flags_to_legacy() to not need to update the vm_get_page_prot()
logic as this time.
Note that in create_init_stack_vma() we have to replace the BUILD_BUG_ON()
with a VM_WARN_ON_ONCE() as the tested values are no longer build time
available.
We also update mprotect_fixup() to use VMA flags where possible, though we
have to live with a little duplication between vm_flags_t and vma_flags_t
values for the time being until further conversions are made.
While we're here, update VM_SPECIAL to be defined in terms of
VMA_SPECIAL_FLAGS now we have vma_flags_to_legacy().
Finally, we update the VMA tests to reflect these changes.
Link: https://lkml.kernel.org/r/d02e3e45d9a33d7904b149f5604904089fd640ae.1774034900.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> [SELinux]
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION".
While working on memory hotplug code cleanups, I realized that
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE is not really required anymore.
Changing that revealed some rather nasty looking CONFIG_MIGRATION
handling.
Let's clean that up by introducing a dedicated CONFIG_NUMA_MIGRATION
option and reducing the dependencies that CONFIG_MIGRATION has.
This patch (of 2):
All architectures that select CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE also
select CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG. So we can just remove
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE.
For CONFIG_MIGRATION, make it depend on CONFIG_MEMORY_HOTREMOVE instead,
and make CONFIG_MEMORY_HOTREMOVE select CONFIG_MIGRATION (just like
CONFIG_CMA and CONFIG_COMPACTION already do).
We'll clean up CONFIG_MIGRATION next.
Link: https://lkml.kernel.org/r/20260319-config_migration-v1-0-42270124966f@kernel.org
Link: https://lkml.kernel.org/r/20260319-config_migration-v1-1-42270124966f@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The upcoming change to the HugeTLB vmemmap optimization (HVO) requires
struct pages of the head page to be naturally aligned with regard to the
folio size.
Align vmemmap to MAX_FOLIO_VMEMMAP_ALIGN.
Link: https://lkml.kernel.org/r/20260227194302.274384-7-kas@kernel.org
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: David Hildenbrand (arm) <david@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Frank van der Linden <fvdl@google.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of
ZERO_PAGE() to 4.
Every architecture defines empty_zero_page that way or another, but for the
most of them it is always a page aligned page in BSS and most definitions
of ZERO_PAGE do virt_to_page(empty_zero_page).
Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the
core MM and drop these definitions in architectures that do not implement
colored zero page (MIPS and s390).
ZERO_PAGE() remains a macro because turning it to a wrapper for a static
inline causes severe pain in header dependencies.
For the most part the change is mechanical, with these being noteworthy:
* alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot
parameters. Switching to a generic empty_zero_page removes the aliasing
and keeps ZERO_PGE for boot parameters only
* arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of
ZERO_PAGE() is kept intact.
* m68k/parisc/um: allocated empty_zero_page from memblock,
although they do not support zero page coloring and having it in BSS
will work fine.
* sparc64 can have empty_zero_page in BSS rather allocate it, but it
can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE()
but instead of allocating it, make mem_map_zero point to
empty_zero_page.
* sh: used empty_zero_page for boot parameters at the very early boot.
Rename the parameters page to boot_params_page and let sh use the generic
empty_zero_page.
* hexagon: had an amusing comment about empty_zero_page
/* A handy thing to have if one has the RAM. Declared in head.S */
that unfortunately had to go :)
Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Helge Deller <deller@gmx.de> [parisc]
Tested-by: Helge Deller <deller@gmx.de> [parisc]
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Magnus Lindholm <linmag7@gmail.com> [alpha]
Acked-by: Dinh Nguyen <dinguyen@kernel.org> [nios2]
Acked-by: Andreas Larsson <andreas@gaisler.com> [sparc]
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
to be able to merge the hyper-v patch related to randomness.
|
|
Move the asm/xor.h headers to lib/raid/xor/$(SRCARCH)/xor_arch.h and
include/linux/raid/xor_impl.h to lib/raid/xor/xor_impl.h so that the
xor.ko module implementation is self-contained in lib/raid/.
As this remove the asm-generic mechanism a new kconfig symbol is added to
indicate that a architecture-specific implementations exists, and
xor_arch.h should be included.
Link: https://lkml.kernel.org/r/20260327061704.3707577-22-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Sterba <dsterba@suse.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Li Nan <linan122@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move the optimized XOR into lib/raid and include it it in xor.ko instead
of always building it into the main kernel image.
Link: https://lkml.kernel.org/r/20260327061704.3707577-15-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Sterba <dsterba@suse.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Li Nan <linan122@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Drop the pretty confusing historic XOR_TRY_TEMPLATES and
XOR_SELECT_TEMPLATE, and instead let the architectures provide a
arch_xor_init that calls either xor_register to register candidates or
xor_force to force a specific implementation.
Link: https://lkml.kernel.org/r/20260327061704.3707577-10-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Sterba <dsterba@suse.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Li Nan <linan122@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Loongson 7A PCH chips all contain a LPC controller, which is used in
some devices to connect legacy ISA devices (e.g. 8259 PS/2 controller).
The LPC irqchip driver will register LPC interrupts at the fixed range
0~15, and the PCH PIC irqchip driver uses dynamic allocation. However the
LPC interrupt numbers are currently not exempted from dynamic allocation.
The current setup work by accident because the LPC interrupt controller is
the first consumer of PIC interrupt controller, and the PIC interrupt
number is allocated after LPC interrupts are registered. Such setup is
fragile and will stop to work when the LPC irqchip driver is reworked.
Override arch_dynirq_lower_bound() to reserve LPC interrupts from dynamic
allocation, to prevent interrupt number collision and allow rework of the
LPC irqchip driver.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20260321092032.3502701-3-zhengxingda@iscas.ac.cn
|
|
In function kvm_eiointc_regs_access(), the register base address is
caculated from array base address plus offset, the offset is absolute
value from the base address. The data type of array base address is
u64, it should be converted into the "void *" type and then plus the
offset.
Cc: <stable@vger.kernel.org>
Fixes: d3e43a1f34ac ("LoongArch: KVM: Use 64-bit register definition for EIOINTC").
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
EIOINTC's coremap in eiointc_update_sw_coremap() can be empty, currently
we get a cpuid with -1 in this case, but we actually need 0 because it's
similar as the case that cpuid >= 4.
This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[].
Cc: <stable@vger.kernel.org>
Fixes: 3956a52bc05bd81 ("LoongArch: KVM: Add EIOINTC read and write functions")
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
kvm_get_vcpu_by_cpuid() takes a cpuid parameter whose type is int, so
cpuid can be negative. Let kvm_get_vcpu_by_cpuid() return NULL for this
case so as to make it more robust.
This fix an out-of-bounds access to kvm_arch::phyid_map::phys_map[].
Cc: <stable@vger.kernel.org>
Fixes: 73516e9da512adc ("LoongArch: KVM: Add vcpu mapping from physical cpuid")
Reported-by: Aurelien Jarno <aurel32@debian.org>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1131431
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
With -fno-asynchronous-unwind-tables and --no-eh-frame-hdr (the default
of the linker), the GNU_EH_FRAME segment (specified by vdso.lds.S) is
empty. This is not valid, as the current DWARF specification mandates
the first byte of the EH frame to be the version number 1. It causes
some unwinders to complain, for example the ClickHouse query profiler
spams the log with messages:
clickhouse-server[365854]: libunwind: unsupported .eh_frame_hdr
version: 127 at 7ffffffb0000
Here "127" is just the byte located at the p_vaddr (0, i.e. the
beginning of the vDSO) of the empty GNU_EH_FRAME segment. Cross-
checking with /proc/365854/maps has also proven 7ffffffb0000 is the
start of vDSO in the process VM image.
In LoongArch the -fno-asynchronous-unwind-tables option seems just a
MIPS legacy, and MIPS only uses this option to satisfy the MIPS-specific
"genvdso" program, per the commit cfd75c2db17e ("MIPS: VDSO: Explicitly
use -fno-asynchronous-unwind-tables"). IIRC it indicates some inherent
limitation of the MIPS ELF ABI and has nothing to do with LoongArch. So
we can simply flip it over to -fasynchronous-unwind-tables and pass
--eh-frame-hdr for linking the vDSO, allowing the profilers to unwind the
stack for statistics even if the sample point is taken when the PC is in
the vDSO.
However simply adjusting the options above would exploit an issue: when
the libgcc unwinder saw the invalid GNU_EH_FRAME segment, it silently
falled back to a machine-specific routine to match the code pattern of
rt_sigreturn() and extract the registers saved in the sigframe if the
code pattern is matched. As unwinding from signal handlers is vital for
libgcc to support pthread cancellation etc., the fall-back routine had
been silently keeping the LoongArch Linux systems functioning since
Linux 5.19. But when we start to emit GNU_EH_FRAME with the correct
format, fall-back routine will no longer be used and libgcc will fail
to unwind the sigframe, and unwinding from signal handlers will no
longer work, causing dozens of glibc test failures. To make it possible
to unwind from signal handlers again, it's necessary to code the unwind
info in __vdso_rt_sigreturn via .cfi_* directives.
The offsets in the .cfi_* directives depend on the layout of struct
sigframe, notably the offset of sigcontext in the sigframe. To use the
offset in the assembly file, factor out struct sigframe into a header to
allow asm-offsets.c to output the offset for assembly.
To work around a long-term issue in the libgcc unwinder (the pc is
unconditionally substracted by 1: doing so is technically incorrect for
a signal frame), a nop instruction is included with the two real
instructions in __vdso_rt_sigreturn in the same FDE PC range. The same
hack has been used on x86 for a long time.
Cc: stable@vger.kernel.org
Fixes: c6b99bed6b8f ("LoongArch: Add VDSO and VSYSCALL support")
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
1. Hardware limitation: GPU, DC and VPU are typically PCI device 06.0,
06.1 and 06.2. They share some hardware resources, so when configure the
PCI 06.0 device BAR1, DMA memory access cannot be performed through this
BAR, otherwise it will cause hardware abnormalities.
2. In typical scenarios of reboot or S3/S4, DC access to memory through
BAR is not prohibited, resulting in GPU DMA hangs.
3. Workaround method: When configuring the 06.0 device BAR1, turn off
the memory access of DC, GPU and VPU (via DC's CRTC registers).
Cc: stable@vger.kernel.org
Signed-off-by: Qianhai Wu <wuqianhai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
1. Replace "of_find_node_by_path("/")" with "of_root" to avoid multiple
calls to "of_node_put()".
2. Fix a potential kernel oops during early boot when memory allocation
fails while parsing CPU model from device tree.
Cc: stable@vger.kernel.org
Signed-off-by: Li Jun <lijun01@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Previously different architectures were using random sources of
differing strength and cost to decide the random kstack offset. A number
of architectures (loongarch, powerpc, s390, x86) were using their
timestamp counter, at whatever the frequency happened to be. Other
arches (arm64, riscv) were using entropy from the crng via
get_random_u16().
There have been concerns that in some cases the timestamp counters may
be too weak, because they can be easily guessed or influenced by user
space. And get_random_u16() has been shown to be too costly for the
level of protection kstack offset randomization provides.
So let's use a common, architecture-agnostic source of entropy; a
per-cpu prng, seeded at boot-time from the crng. This has a few
benefits:
- We can remove choose_random_kstack_offset(); That was only there to
try to make the timestamp counter value a bit harder to influence
from user space [*].
- The architecture code is simplified. All it has to do now is call
add_random_kstack_offset() in the syscall path.
- The strength of the randomness can be reasoned about independently
of the architecture.
- Arches previously using get_random_u16() now have much faster
syscall paths, see below results.
[*] Additionally, this gets rid of some redundant work on s390 and x86.
Before this patch, those architectures called
choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(),
which is also called for exception returns to userspace which were *not*
syscalls (e.g. regular interrupts). Getting rid of
choose_random_kstack_offset() avoids a small amount of redundant work
for the non-syscall cases.
In some configurations, add_random_kstack_offset() will now call
instrumentable code, so for a couple of arches, I have moved the call a
bit later to the first point where instrumentation is allowed. This
doesn't impact the efficacy of the mechanism.
There have been some claims that a prng may be less strong than the
timestamp counter if not regularly reseeded. But the prng has a period
of about 2^113. So as long as the prng state remains secret, it should
not be possible to guess. If the prng state can be accessed, we have
bigger problems.
Additionally, we are only consuming 6 bits to randomize the stack, so
there are only 64 possible random offsets. I assert that it would be
trivial for an attacker to brute force by repeating their attack and
waiting for the random stack offset to be the desired one. The prng
approach seems entirely proportional to this level of protection.
Performance data are provided below. The baseline is v6.18 with rndstack
on for each respective arch. (I)/(R) indicate statistically significant
improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal).
x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge):
+-----------------+--------------+---------------+---------------+
| Benchmark | Result Class | per-cpu-prng | per-cpu-prng |
| | | arm64 (metal) | x86_64 (VM) |
+=================+==============+===============+===============+
| syscall/getpid | mean (ns) | (I) -9.50% | (I) -17.65% |
| | p99 (ns) | (I) -59.24% | (I) -24.41% |
| | p99.9 (ns) | (I) -59.52% | (I) -28.52% |
+-----------------+--------------+---------------+---------------+
| syscall/getppid | mean (ns) | (I) -9.52% | (I) -19.24% |
| | p99 (ns) | (I) -59.25% | (I) -25.03% |
| | p99.9 (ns) | (I) -59.50% | (I) -28.17% |
+-----------------+--------------+---------------+---------------+
| syscall/invalid | mean (ns) | (I) -10.31% | (I) -18.56% |
| | p99 (ns) | (I) -60.79% | (I) -20.06% |
| | p99.9 (ns) | (I) -61.04% | (I) -25.04% |
+-----------------+--------------+---------------+---------------+
I tested an earlier version of this change on x86 bare metal and it
showed a smaller but still significant improvement. The bare metal
system wasn't available this time around so testing was done in a VM
instance. I'm guessing the cost of rdtsc is higher for VMs.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://patch.msgid.link/20260303150840.3789438-3-ryan.roberts@arm.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The kconfig options for generic crypto API modules have traditionally
*not* had a "_GENERIC" suffix. Also, the "_GENERIC" suffix will make
even less sense once the architecture-optimized SM3 code is moved into
lib/crypto/ and the "sm3" crypto_shash is reimplemented on top of that.
Thus, rename CRYPTO_SM3_GENERIC to CRYPTO_SM3.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260321040935.410034-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
The 'leading' helper returns BITS_PER_LONG if x == 0, while 'trailing'
one returns COUNT_TRAILING_ZEROS_0, which turns to be -1.
None of the current users explicitly check the returned value for
COUNT_TRAILING_ZEROS_0, except the loongarch, which tests implicitly
for the '>= 0'.
So, align count_trailing_zeros() with the count_leading_zeros(), and
simplify the loongarch handling.
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
|
|
Delete IRQ_MATRIX_BITS leftovers after commit 5b98d210ac1e ("genirq/matrix:
Dynamic bitmap allocation") has made IRQ_MATRIX_BITS obsolete.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260316072850.467995-1-namcao@linutronix.de
|
|
Most of VM feature detections are integer OR operations, and integer
assignment operation will clear previous integer OR operation. So here
change all integer assignment operations to integer OR operations.
Fixes: 82db90bf461b ("LoongArch: KVM: Move feature detection in kvm_vm_init_features()")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Occasionally there exist "text_copy_cb: operation failed" when executing
the bpf selftests, the reason is copy_to_kernel_nofault() failed and the
ecode of ESTAT register is 0x4 (PME: Page Modification Exception) due to
the pte is not writeable. The root cause is that there is another place
to set the pte entry as readonly which is in the generic weak version of
arch_protect_bpf_trampoline().
There are two ways to fix this race condition issue: the direct way is
to modify the generic weak arch_protect_bpf_trampoline() to add a mutex
lock for set_memory_rox(), but the other simple and proper way is to
just make arch_protect_bpf_trampoline() return 0 in the arch-specific
code because LoongArch has already use the BPF prog pack allocator for
trampoline.
Here are the trimmed kernel log messages:
copy_to_kernel_nofault: memory access failed, ecode 0x4
copy_to_kernel_nofault: the caller is text_copy_cb+0x50/0xa0
text_copy_cb: operation failed
------------[ cut here ]------------
bpf_prog_pack bug: missing bpf_arch_text_invalidate?
WARNING: kernel/bpf/core.c:1008 at bpf_prog_pack_free+0x200/0x228
...
Call Trace:
[<9000000000248914>] show_stack+0x64/0x188
[<9000000000241308>] dump_stack_lvl+0x6c/0x9c
[<90000000002705bc>] __warn+0x9c/0x200
[<9000000001c428c0>] __report_bug+0xa8/0x1c0
[<9000000001c42b5c>] report_bug+0x64/0x120
[<9000000001c7dcd0>] do_bp+0x270/0x3c0
[<9000000000246f40>] handle_bp+0x120/0x1c0
[<900000000047b030>] bpf_prog_pack_free+0x200/0x228
[<900000000047b2ec>] bpf_jit_binary_pack_free+0x24/0x60
[<900000000026989c>] bpf_jit_free+0x54/0xb0
[<900000000029e10c>] process_one_work+0x184/0x610
[<900000000029ef8c>] worker_thread+0x24c/0x388
[<90000000002a902c>] kthread+0x13c/0x170
[<9000000001c7dfe8>] ret_from_kernel_thread+0x28/0x1c0
[<9000000000246624>] ret_from_kernel_thread_asm+0xc/0x88
---[ end trace 0000000000000000 ]---
Here is a simple shell script to reproduce:
#!/bin/bash
for ((i=1; i<=1000; i++))
do
echo "Under testing $i ..."
dmesg -c > /dev/null
./test_progs -t fentry_attach_stress > /dev/null
dmesg -t | grep "text_copy_cb: operation failed"
if [ $? -eq 0 ]; then
break
fi
done
Cc: stable@vger.kernel.org
Fixes: 4ab17e762b34 ("LoongArch: BPF: Use BPF prog pack allocator")
Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
If copy_to_kernel_nofault() failed, no need to flush icache and just
return immediately.
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
set_memory_rw() and set_memory_rox() may fail, so we should check the
return values and return immediately in larch_insn_text_copy().
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
If memory access such as copy_{from, to}_kernel_nofault() failed, its
users do not know what happened, so it is very useful to print the
exception code for such cases. Furthermore, it is better to print the
caller function to know where is the entry.
Here are the low level call chains:
copy_from_kernel_nofault()
copy_from_kernel_nofault_loop()
__get_kernel_nofault()
copy_to_kernel_nofault()
copy_to_kernel_nofault_loop()
__put_kernel_nofault()
Cc: stable@vger.kernel.org
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|