summaryrefslogtreecommitdiff
path: root/tools/include
AgeCommit message (Collapse)Author
2022-06-27objtool: Add entry UNRET validationPeter Zijlstra
Since entry asm is tricky, add a validation pass that ensures the retbleed mitigation has been done before the first actual RET instruction. Entry points are those that either have UNWIND_HINT_ENTRY, which acts as UNWIND_HINT_EMPTY but marks the instruction as an entry point, or those that have UWIND_HINT_IRET_REGS at +0. This is basically a variant of validate_branch() that is intra-function and it will simply follow all branches from marked entry points and ensures that all paths lead to ANNOTATE_UNRET_END. If a path hits RET or an indirection the path is a fail and will be reported. There are 3 ANNOTATE_UNRET_END instances: - UNTRAIN_RET itself - exception from-kernel; this path doesn't need UNTRAIN_RET - all early exceptions; these also don't need UNTRAIN_RET Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de>
2022-06-26tools include UAPI: Sync linux/vhost.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: 84d7c8fd3aade2fe ("vhost-vdpa: introduce uAPI to set group ASID") 2d1fcb7758e49fd9 ("vhost-vdpa: uAPI to get virtqueue group id") a0c95f201170bd55 ("vhost-vdpa: introduce uAPI to get the number of address spaces") 3ace88bd37436abc ("vhost-vdpa: introduce uAPI to get the number of virtqueue groups") 175d493c3c3e09a3 ("vhost: move the backend feature bits to vhost_types.h") Silencing this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h' diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h To pick up these changes and support them: $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after $ diff -u before after --- before 2022-06-26 12:04:35.982003781 -0300 +++ after 2022-06-26 12:04:43.819972476 -0300 @@ -28,6 +28,7 @@ [0x74] = "VDPA_SET_CONFIG", [0x75] = "VDPA_SET_VRING_ENABLE", [0x77] = "VDPA_SET_CONFIG_CALL", + [0x7C] = "VDPA_SET_GROUP_ASID", }; static const char *vhost_virtio_ioctl_read_cmds[] = { [0x00] = "GET_FEATURES", @@ -39,5 +40,8 @@ [0x76] = "VDPA_GET_VRING_NUM", [0x78] = "VDPA_GET_IOVA_RANGE", [0x79] = "VDPA_GET_CONFIG_SIZE", + [0x7A] = "VDPA_GET_AS_NUM", + [0x7B] = "VDPA_GET_VRING_GROUP", [0x80] = "VDPA_GET_VQS_COUNT", + [0x81] = "VDPA_GET_GROUP_NUM", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Gautam Dawar <gautam.dawar@xilinx.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Yrh3xMYbfeAD0MFL@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-26tools headers UAPI: Sync drm/i915_drm.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: ecf8eca51f33dbfd ("drm/i915/xehp: Add compute engine ABI") 991b4de3275728fd ("drm/i915/uapi: Add kerneldoc for engine class enum") c94fde8f516610b0 ("drm/i915/uapi: Add DRM_I915_QUERY_GEOMETRY_SUBSLICES") 1c671ad753dbbf5f ("drm/i915/doc: Link query items to their uapi structs") a2e5402691e23269 ("drm/i915/doc: Convert perf UAPI comments to kerneldoc") 462ac1cdf4d7acf1 ("drm/i915/doc: Convert drm_i915_query_topology_info comment to kerneldoc") 034d47b25b2ce627 ("drm/i915/uapi: Document DRM_I915_QUERY_HWCONFIG_BLOB") 78e1fb3112c0ac44 ("drm/i915/uapi: Add query for hwconfig blob") That don't add any new ioctl, so no changes in tooling. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h' diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h Cc: John Harrison <John.C.Harrison@intel.com> Cc: Matt Atwood <matthew.s.atwood@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://lore.kernel.org/lkml/YrDi4ALYjv9Mdocq@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-20Merge tag 'perf-tools-fixes-for-v5.19-2022-06-19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool fixes from Arnaldo Carvalho de Melo: - Don't set data source if it's not a memory operation in ARM SPE (Statistical Profiling Extensions). - Fix handling of exponent floating point values in perf stat expressions. - Don't leak fd on failure on libperf open. - Fix 'perf test' CPU topology test for PPC guest systems. - Fix undefined behaviour on breakpoint account 'perf test' entry. - Record only user callchains on the "Check ARM64 callgraphs are complete in FP mode" 'perf test' entry. - Fix "perf stat CSV output linter" test on s390. - Sync batch of kernel headers with tools/perf/. * tag 'perf-tools-fixes-for-v5.19-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: tools headers UAPI: Sync linux/prctl.h with the kernel sources perf metrics: Ensure at least 1 id per metric tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources perf arm-spe: Don't set data source if it's not a memory operation perf expr: Allow exponents on floating point values perf test topology: Use !strncmp(right platform) to fix guest PPC comparision check perf test: Record only user callchains on the "Check Arm64 callgraphs are complete in fp mode" test perf beauty: Update copy of linux/socket.h with the kernel sources perf test: Fix variable length array undefined behavior in bp_account libperf evsel: Open shouldn't leak fd on failure perf test: Fix "perf stat CSV output linter" test on s390 perf unwind: Fix uninitialized variable
2022-06-19tools headers UAPI: Sync linux/prctl.h with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes in: 9e4ab6c891094720 ("arm64/sme: Implement vector length configuration prctl()s") That don't result in any changes in tooling: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after $ Just silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Link: http://lore.kernel.org/lkml/Yq81we+XFOqlBWyu@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-06-06x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usageJosh Poimboeuf
The file-wide OBJECT_FILES_NON_STANDARD annotation is used with CONFIG_FRAME_POINTER to tell objtool to skip the entire file when frame pointers are enabled. However that annotation is now deprecated because it doesn't work with IBT, where objtool runs on vmlinux.o instead of individual translation units. Instead, use more fine-grained function-specific annotations: - The 'save_mcount_regs' macro does funny things with the frame pointer. Use STACK_FRAME_NON_STANDARD_FP to tell objtool to ignore the functions using it. - The return_to_handler() "function" isn't actually a callable function. Instead of being called, it's returned to. The real return address isn't on the stack, so unwinding is already doomed no matter which unwinder is used. So just remove the STT_FUNC annotation, telling objtool to ignore it. That also removes the implicit ANNOTATE_NOENDBR, which now needs to be made explicit. Fixes the following warning: vmlinux.o: warning: objtool: __fentry__+0x16: return with modified stack frame Fixes: ed53a0d97192 ("x86/alternative: Use .ibt_endbr_seal to seal indirect calls") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/b7a7a42fe306aca37826043dac89e113a1acdbac.1654268610.git.jpoimboe@kernel.org
2022-06-04Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linuxLinus Torvalds
Pull bitmap updates from Yury Norov: - bitmap: optimize bitmap_weight() usage, from me - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro Carvalho Chehab - include/linux/find: Fix documentation, from Anna-Maria Behnsen - bitmap: fix conversion from/to fix-sized arrays, from me - bitmap: Fix return values to be unsigned, from Kees Cook It has been in linux-next for at least a week with no problems. * tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits) nodemask: Fix return values to be unsigned bitmap: Fix return values to be unsigned KVM: x86: hyper-v: replace bitmap_weight() with hweight64() KVM: x86: hyper-v: fix type of valid_bank_mask ia64: cleanup remove_siblinginfo() drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate lib/bitmap: add test for bitmap_{from,to}_arr64 lib: add bitmap_{from,to}_arr64 lib/bitmap: extend comment for bitmap_(from,to)_arr32() include/linux/find: Fix documentation lib/bitmap.c make bitmap_print_bitmask_to_buf parseable MAINTAINERS: add cpumask and nodemask files to BITMAP_API arch/x86: replace nodes_weight with nodes_empty where appropriate mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate clocksource: replace cpumask_weight with cpumask_empty in clocksource.c genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate irq: mips: replace cpumask_weight with cpumask_empty where appropriate drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate arch/x86: replace cpumask_weight with cpumask_empty where appropriate ...
2022-06-03bitmap: Fix return values to be unsignedKees Cook
Both nodemask and bitmap routines had mixed return values that provided potentially signed return values that could never happen. This was leading to the compiler getting confusing about the range of possible return values (it was thinking things could be negative where they could not be). In preparation for fixing nodemask, fix all the bitmap routines that should be returning unsigned (or bool) values. Cc: Yury Norov <yury.norov@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Christophe de Dinechin <dinechin@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03LoongArch: Add other common headersHuacai Chen
Add some other common headers for basic LoongArch support. Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-05-31Merge tag 'riscv-for-linus-5.19-mw0' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the Svpbmt extension, which allows memory attributes to be encoded in pages - Support for the Allwinner D1's implementation of page-based memory attributes - Support for running rv32 binaries on rv64 systems, via the compat subsystem - Support for kexec_file() - Support for the new generic ticket-based spinlocks, which allows us to also move to qrwlock. These should have already gone in through the asm-geneic tree as well - A handful of cleanups and fixes, include some larger ones around atomics and XIP * tag 'riscv-for-linus-5.19-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits) RISC-V: Prepare dropping week attribute from arch_kexec_apply_relocations[_add] riscv: compat: Using seperated vdso_maps for compat_vdso_info RISC-V: Fix the XIP build RISC-V: Split out the XIP fixups into their own file RISC-V: ignore xipImage RISC-V: Avoid empty create_*_mapping definitions riscv: Don't output a bogus mmu-type on a no MMU kernel riscv: atomic: Add custom conditional atomic operation implementation riscv: atomic: Optimize dec_if_positive functions riscv: atomic: Cleanup unnecessary definition RISC-V: Load purgatory in kexec_file RISC-V: Add purgatory RISC-V: Support for kexec_file on panic RISC-V: Add kexec_file support RISC-V: use memcpy for kexec_file mode kexec_file: Fix kexec_file.c build error for riscv platform riscv: compat: Add COMPAT Kbuild skeletal support riscv: compat: ptrace: Add compat_arch_ptrace implement riscv: compat: signal: Add rt_frame implementation riscv: add memory-type errata for T-Head ...
2022-05-26Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...
2022-05-25Merge tag 'net-next-5.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ...
2022-05-25Merge tag 'kvm-riscv-5.19-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini
KVM/riscv changes for 5.19 - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support
2022-05-24Merge tag 'objtool-core-2022-05-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Comprehensive interface overhaul: ================================= Objtool's interface has some issues: - Several features are done unconditionally, without any way to turn them off. Some of them might be surprising. This makes objtool tricky to use, and prevents porting individual features to other arches. - The config dependencies are too coarse-grained. Objtool enablement is tied to CONFIG_STACK_VALIDATION, but it has several other features independent of that. - The objtool subcmds ("check" and "orc") are clumsy: "check" is really a subset of "orc", so it has all the same options. The subcmd model has never really worked for objtool, as it only has a single purpose: "do some combination of things on an object file". - The '--lto' and '--vmlinux' options are nonsensical and have surprising behavior. Overhaul the interface: - get rid of subcmds - make all features individually selectable - remove and/or clarify confusing/obsolete options - update the documentation - fix some bugs found along the way - Fix x32 regression - Fix Kbuild cleanup bugs - Add scripts/objdump-func helper script to disassemble a single function from an object file. - Rewrite scripts/faddr2line to be section-aware, by basing it on 'readelf', moving it away from 'nm', which doesn't handle multiple sections well, which can result in decoding failure. - Rewrite & fix symbol handling - which had a number of bugs wrt. object files that don't have global symbols - which is rare but possible. Also fix a bunch of symbol handling bugs found along the way. * tag 'objtool-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) objtool: Fix objtool regression on x32 systems objtool: Fix symbol creation scripts/faddr2line: Fix overlapping text section failures scripts: Create objdump-func helper script objtool: Remove libsubcmd.a when make clean objtool: Remove inat-tables.c when make clean objtool: Update documentation objtool: Remove --lto and --vmlinux in favor of --link objtool: Add HAVE_NOINSTR_VALIDATION objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION" objtool: Make noinstr hacks optional objtool: Make jump label hack optional objtool: Make static call annotation optional objtool: Make stack validation frame-pointer-specific objtool: Add CONFIG_OBJTOOL objtool: Extricate sls from stack validation objtool: Rework ibt and extricate from stack validation objtool: Make stack validation optional objtool: Add option to print section addresses objtool: Don't print parentheses in function addresses ...
2022-05-23Merge tag 'x86_asm_for_v5.19_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Borislav Petkov: - A bunch of changes towards streamlining low level asm helpers' calling conventions so that former can be converted to C eventually - Simplify PUSH_AND_CLEAR_REGS so that it can be used at the system call entry paths instead of having opencoded, slightly different variants of it everywhere - Misc other fixes * tag 'x86_asm_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Fix register corruption in compat syscall objtool: Fix STACK_FRAME_NON_STANDARD reloc type linkage: Fix issue with missing symbol size x86/entry: Remove skip_r11rcx x86/entry: Use PUSH_AND_CLEAR_REGS for compat x86/entry: Simplify entry_INT80_compat() x86/mm: Simplify RESERVE_BRK() x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS x86/entry: Don't call error_entry() for XENPV x86/entry: Move CLD to the start of the idtentry macro x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry() x86/entry: Switch the stack after error_entry() returns x86/traps: Use pt_regs directly in fixup_bad_iret()
2022-05-23Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-05-23 We've added 113 non-merge commits during the last 26 day(s) which contain a total of 121 files changed, 7425 insertions(+), 1586 deletions(-). The main changes are: 1) Speed up symbol resolution for kprobes multi-link attachments, from Jiri Olsa. 2) Add BPF dynamic pointer infrastructure e.g. to allow for dynamically sized ringbuf reservations without extra memory copies, from Joanne Koong. 3) Big batch of libbpf improvements towards libbpf 1.0 release, from Andrii Nakryiko. 4) Add BPF link iterator to traverse links via seq_file ops, from Dmitrii Dolgov. 5) Add source IP address to BPF tunnel key infrastructure, from Kaixi Fan. 6) Refine unprivileged BPF to disable only object-creating commands, from Alan Maguire. 7) Fix JIT blinding of ld_imm64 when they point to subprogs, from Alexei Starovoitov. 8) Add BPF access to mptcp_sock structures and their meta data, from Geliang Tang. 9) Add new BPF helper for access to remote CPU's BPF map elements, from Feng Zhou. 10) Allow attaching 64-bit cookie to BPF link of fentry/fexit/fmod_ret, from Kui-Feng Lee. 11) Follow-ups to typed pointer support in BPF maps, from Kumar Kartikeya Dwivedi. 12) Add busy-poll test cases to the XSK selftest suite, from Magnus Karlsson. 13) Improvements in BPF selftest test_progs subtest output, from Mykola Lysenko. 14) Fill bpf_prog_pack allocator areas with illegal instructions, from Song Liu. 15) Add generic batch operations for BPF map-in-map cases, from Takshak Chahande. 16) Make bpf_jit_enable more user friendly when permanently on 1, from Tiezhu Yang. 17) Fix an array overflow in bpf_trampoline_get_progs(), from Yuntao Wang. ==================== Link: https://lore.kernel.org/r/20220523223805.27931-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-23bpf: Add dynptr data slicesJoanne Koong
This patch adds a new helper function void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len); which returns a pointer to the underlying data of a dynptr. *len* must be a statically known value. The bpf program may access the returned data slice as a normal buffer (eg can do direct reads and writes), since the verifier associates the length with the returned pointer, and enforces that no out of bounds accesses occur. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-6-joannelkoong@gmail.com
2022-05-23bpf: Add bpf_dynptr_read and bpf_dynptr_writeJoanne Koong
This patch adds two helper functions, bpf_dynptr_read and bpf_dynptr_write: long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset); long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len); The dynptr passed into these functions must be valid dynptrs that have been initialized. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523210712.3641569-5-joannelkoong@gmail.com
2022-05-23bpf: Dynptr support for ring buffersJoanne Koong
Currently, our only way of writing dynamically-sized data into a ring buffer is through bpf_ringbuf_output but this incurs an extra memcpy cost. bpf_ringbuf_reserve + bpf_ringbuf_commit avoids this extra memcpy, but it can only safely support reservation sizes that are statically known since the verifier cannot guarantee that the bpf program won’t access memory outside the reserved space. The bpf_dynptr abstraction allows for dynamically-sized ring buffer reservations without the extra memcpy. There are 3 new APIs: long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr); void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags); void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags); These closely follow the functionalities of the original ringbuf APIs. For example, all ringbuffer dynptrs that have been reserved must be either submitted or discarded before the program exits. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com
2022-05-23bpf: Add bpf_dynptr_from_mem for local dynptrsJoanne Koong
This patch adds a new api bpf_dynptr_from_mem: long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr); which initializes a dynptr to point to a bpf program's local memory. For now only local memory that is of reg type PTR_TO_MAP_VALUE is supported. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523210712.3641569-3-joannelkoong@gmail.com
2022-05-23bpf: Add verifier support for dynptrsJoanne Koong
This patch adds the bulk of the verifier work for supporting dynamic pointers (dynptrs) in bpf. A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure defined internally as: struct bpf_dynptr_kern { void *data; u32 size; u32 offset; } __aligned(8); The upper 8 bits of *size* is reserved (it contains extra metadata about read-only status and dynptr type). Consequently, a dynptr only supports memory less than 16 MB. There are different types of dynptrs (eg malloc, ringbuf, ...). In this patchset, the most basic one, dynptrs to a bpf program's local memory, is added. For now only local memory that is of reg type PTR_TO_MAP_VALUE is supported. In the verifier, dynptr state information will be tracked in stack slots. When the program passes in an uninitialized dynptr (ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots corresponding to the frame pointer where the dynptr resides at are marked STACK_DYNPTR. For helper functions that take in initialized dynptrs (eg bpf_dynptr_read + bpf_dynptr_write which are added later in this patchset), the verifier enforces that the dynptr has been initialized properly by checking that their corresponding stack slots have been marked as STACK_DYNPTR. The 6th patch in this patchset adds test cases that the verifier should successfully reject, such as for example attempting to use a dynptr after doing a direct write into it inside the bpf program. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com
2022-05-23Merge tag 'nolibc.2022.05.20a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull nolibc library updates from Paul McKenney: "This adds a number of library functions and splits this library into multiple files" * tag 'nolibc.2022.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (61 commits) tools/nolibc/string: Implement `strdup()` and `strndup()` tools/nolibc/string: Implement `strnlen()` tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()` tools/nolibc/types: Implement `offsetof()` and `container_of()` macro tools/nolibc/sys: Implement `mmap()` and `munmap()` tools/nolibc: i386: Implement syscall with 6 arguments tools/nolibc: Remove .global _start from the entry point code tools/nolibc: Replace `asm` with `__asm__` tools/nolibc: x86-64: Update System V ABI document link tools/nolibc/stdlib: only reference the external environ when inlined tools/nolibc/string: do not use __builtin_strlen() at -O0 tools/nolibc: add the nolibc subdir to the common Makefile tools/nolibc: add a makefile to install headers tools/nolibc/types: add poll() and waitpid() flag definitions tools/nolibc/sys: add syscall definition for getppid() tools/nolibc/string: add strcmp() and strncmp() tools/nolibc/stdio: add support for '%p' to vfprintf() tools/nolibc/stdlib: add a simple getenv() implementation tools/nolibc/stdio: make printf(%s) accept NULL tools/nolibc/stdlib: implement abort() ...
2022-05-20bpf: Add bpf_skc_to_mptcp_sock_protoGeliang Tang
This patch implements a new struct bpf_func_proto, named bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP, and a new helper bpf_skc_to_mptcp_sock(), which invokes another new helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct mptcp_sock from a given subflow socket. v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c, remove build check for CONFIG_BPF_JIT v5: Drop EXPORT_SYMBOL (Martin) Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com
2022-05-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ethernet/mellanox/mlx5/core/main.c b33886971dbc ("net/mlx5: Initialize flow steering during driver probe") 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support") f2b41b32cde8 ("net/mlx5: Remove ipsec_ops function table") https://lore.kernel.org/all/20220519040345.6yrjromcdistu7vh@sx1/ 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") 8324a02c342a ("net/mlx5: Add exit route when waiting for FW") https://lore.kernel.org/all/20220519114119.060ce014@canb.auug.org.au/ tools/testing/selftests/net/mptcp/mptcp_join.sh e274f7154008 ("selftests: mptcp: add subflow limits test-cases") b6e074e171bc ("selftests: mptcp: add infinite map testcase") 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type") https://lore.kernel.org/all/20220516111918.366d747f@canb.auug.org.au/ net/mptcp/options.c ba2c89e0ea74 ("mptcp: fix checksum byte order") 1e39e5a32ad7 ("mptcp: infinite mapping sending") ea66758c1795 ("tcp: allow MPTCP to update the announced window") https://lore.kernel.org/all/20220519115146.751c3a37@canb.auug.org.au/ net/mptcp/pm.c 95d686517884 ("mptcp: fix subflow accounting on close") 4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") https://lore.kernel.org/all/20220516111435.72f35dca@canb.auug.org.au/ net/mptcp/subflow.c ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure") 0348c690ed37 ("mptcp: add the fallback check") f8d4bcacff3b ("mptcp: infinite mapping receiving") https://lore.kernel.org/all/20220519115837.380bb8d4@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-16Merge branch kvm-arm64/hcall-selection into kvmarm-master/nextMarc Zyngier
* kvm-arm64/hcall-selection: : . : Introduce a new set of virtual sysregs for userspace to : select the hypercalls it wants to see exposed to the guest. : : Patches courtesy of Raghavendra and Oliver. : . KVM: arm64: Fix hypercall bitmap writeback when vcpus have already run KVM: arm64: Hide KVM_REG_ARM_*_BMAP_BIT_COUNT from userspace Documentation: Fix index.rst after psci.rst renaming selftests: KVM: aarch64: Add the bitmap firmware registers to get-reg-list selftests: KVM: aarch64: Introduce hypercall ABI test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test tools: Import ARM SMCCC definitions Docs: KVM: Add doc for the bitmap firmware registers Docs: KVM: Rename psci.rst to hypercalls.rst KVM: arm64: Add vendor hypervisor firmware register KVM: arm64: Add standard hypervisor firmware register KVM: arm64: Setup a framework for hypercall bitmap firmware registers KVM: arm64: Factor out firmware register handling from psci.c Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-05-16net: add IFLA_TSO_{MAX_SIZE|SEGS} attributesEric Dumazet
New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS are used to report to user-space the device TSO limits. ip -d link sh dev eth1 ... tso_max_size 65536 tso_max_segs 65535 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-11bpf: add bpf_map_lookup_percpu_elem for percpu mapFeng Zhou
Add new ebpf helpers bpf_map_lookup_percpu_elem. The implementation method is relatively simple, refer to the implementation method of map_lookup_elem of percpu map, increase the parameters of cpu, and obtain it according to the specified cpu. Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm.Kui-Feng Lee
Pass a cookie along with BPF_LINK_CREATE requests. Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie. The cookie of a bpf_tracing_link is available by calling bpf_get_attach_cookie when running the BPF program of the attached link. The value of a cookie will be set at bpf_tramp_run_ctx by the trampoline of the link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com
2022-05-10bpf, x86: Generate trampolines from bpf_tramp_linksKui-Feng Lee
Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect struct bpf_tramp_link(s) for a trampoline. struct bpf_tramp_link extends bpf_link to act as a linked list node. arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to collects all bpf_tramp_link(s) that a trampoline should call. Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links instead of bpf_tramp_progs. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com
2022-05-10bpf: Add source ip in "struct bpf_tunnel_key"Kaixi Fan
Add tunnel source ip field in "struct bpf_tunnel_key". Add related code to set and get tunnel source field. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-08tools headers UAPI: Sync linux/kvm.h with the kernel sourcesArnaldo Carvalho de Melo
To pick the changes in: d495f942f40aa412 ("KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YnE5BIweGmCkpOTN@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-06objtool: Fix STACK_FRAME_NON_STANDARD reloc typePeter Zijlstra
STACK_FRAME_NON_STANDARD results in inconsistent relocation types depending on .c or .S usage: Relocation section '.rela.discard.func_stack_frame_non_standard' at offset 0x3c01090 contains 5 entries: Offset Info Type Symbol's Value Symbol's Name + Addend 0000000000000000 00020c2200000002 R_X86_64_PC32 0000000000047b40 do_suspend_lowlevel + 0 0000000000000008 0002461e00000001 R_X86_64_64 00000000000480a0 machine_real_restart + 0 0000000000000010 0000001400000001 R_X86_64_64 0000000000000000 .rodata + b3d4 0000000000000018 0002444600000002 R_X86_64_PC32 00000000000678a0 __efi64_thunk + 0 0000000000000020 0002659d00000001 R_X86_64_64 0000000000113160 __crash_kexec + 0 Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20220506121631.508692613@infradead.org
2022-05-03tools: Import ARM SMCCC definitionsRaghavendra Rao Ananta
Import the standard SMCCC definitions from include/linux/arm-smccc.h. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-8-rananta@google.com
2022-04-28net: SO_RCVMARK socket option for SO_MARK with recvmsg()Erin MacNeil
Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK should be included in the ancillary data returned by recvmsg(). Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs(). Signed-off-by: Erin MacNeil <lnx.erin@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
include/linux/netdevice.h net/core/dev.c 6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats") 794c24e9921f ("net-core: rx_otherhost_dropped to core_stats") https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/ drivers/net/wan/cosa.c d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()") 89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards") https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-04-27 We've added 85 non-merge commits during the last 18 day(s) which contain a total of 163 files changed, 4499 insertions(+), 1521 deletions(-). The main changes are: 1) Teach libbpf to enhance BPF verifier log with human-readable and relevant information about failed CO-RE relocations, from Andrii Nakryiko. 2) Add typed pointer support in BPF maps and enable it for unreferenced pointers (via probe read) and referenced ones that can be passed to in-kernel helpers, from Kumar Kartikeya Dwivedi. 3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel. 4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced the effective prog array before the rcu_read_lock, from Stanislav Fomichev. 5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic for USDT arguments under riscv{32,64}, from Pu Lehui. 6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire. 7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage so it can be shipped under Alpine which is musl-based, from Dominique Martinet. 8) Clean up {sk,task,inode} local storage trace RCU handling as they do not need to use call_rcu_tasks_trace() barrier, from KP Singh. 9) Improve libbpf API documentation and fix error return handling of various API functions, from Grant Seltzer. 10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data length of frags + frag_list may surpass old offset limit, from Liu Jian. 11) Various improvements to prog_tests in area of logging, test execution and by-name subtest selection, from Mykola Lysenko. 12) Simplify map_btf_id generation for all map types by moving this process to build time with help of resolve_btfids infra, from Menglong Dong. 13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*() helpers; the probing caused always to use old helpers, from Runqing Yang. 14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS tracing macros, from Vladimir Isaev. 15) Cleanup BPF selftests to remove old & unneeded rlimit code given kernel switched to memcg-based memory accouting a while ago, from Yafang Shao. 16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu. 17) Fix BPF selftests in two occasions to work around regressions caused by latest LLVM to unblock CI until their fixes are worked out, from Yonghong Song. 18) Misc cleanups all over the place, from various others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Add libbpf's log fixup logic selftests libbpf: Fix up verifier log for unguarded failed CO-RE relos libbpf: Simplify bpf_core_parse_spec() signature libbpf: Refactor CO-RE relo human description formatting routine libbpf: Record subprog-resolved CO-RE relocations unconditionally selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests libbpf: Avoid joining .BTF.ext data with BPF programs by section name libbpf: Fix logic for finding matching program for CO-RE relocation libbpf: Drop unhelpful "program too large" guess libbpf: Fix anonymous type check in CO-RE logic bpf: Compute map_btf_id during build time selftests/bpf: Add test for strict BTF type check selftests/bpf: Add verifier tests for kptr selftests/bpf: Add C tests for kptr libbpf: Add kptr type tag macros to bpf_helpers.h bpf: Make BTF type match stricter for release arguments bpf: Teach verifier about kptr_get kfunc helpers bpf: Wire up freeing of referenced kptr bpf: Populate pairs of btf_id and destructor kfunc in btf bpf: Adapt copy_map_value for multiple offset case ... ==================== Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-26syscalls: compat: Fix the missing part for __SYSCALL_COMPATGuo Ren
Make "uapi asm unistd.h" could be used for architectures' COMPAT mode. The __SYSCALL_COMPAT is first used in riscv. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-8-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26uapi: always define F_GETLK64/F_SETLK64/F_SETLKW64 in fcntl.hChristoph Hellwig
The F_GETLK64/F_SETLK64/F_SETLKW64 fcntl opcodes are only implemented for the 32-bit syscall APIs, but are also needed for compat handling on 64-bit kernels. Consolidate them in unistd.h instead of definining the internal compat definitions in compat.h, which is rather error prone (e.g. parisc gets the values wrong currently). Note that before this change they were never visible to userspace due to the fact that CONFIG_64BIT is only set for kernel builds. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26uapi: simplify __ARCH_FLOCK{,64}_PAD a littleChristoph Hellwig
Don't bother to define the symbols empty, just don't use them. That makes the intent a little more clear. Remove the unused HAVE_ARCH_STRUCT_FLOCK64 define and merge the 32-bit mips struct flock into the generic one. Add a new __ARCH_FLOCK_EXTRA_SYSID macro following the style of __ARCH_FLOCK_PAD to avoid having a separate definition just for one architecture. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20220405071314.3225832-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-25bpf: Allow storing referenced kptr in mapKumar Kartikeya Dwivedi
Extending the code in previous commits, introduce referenced kptr support, which needs to be tagged using 'kptr_ref' tag instead. Unlike unreferenced kptr, referenced kptr have a lot more restrictions. In addition to the type matching, only a newly introduced bpf_kptr_xchg helper is allowed to modify the map value at that offset. This transfers the referenced pointer being stored into the map, releasing the references state for the program, and returning the old value and creating new reference state for the returned pointer. Similar to unreferenced pointer case, return value for this case will also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer must either be eventually released by calling the corresponding release function, otherwise it must be transferred into another map. It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear the value, and obtain the old value if any. BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future commit will permit using BPF_LDX for such pointers, but attempt at making it safe, since the lifetime of object won't be guaranteed. There are valid reasons to enforce the restriction of permitting only bpf_kptr_xchg to operate on referenced kptr. The pointer value must be consistent in face of concurrent modification, and any prior values contained in the map must also be released before a new one is moved into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg returns the old value, which the verifier would require the user to either free or move into another map, and releases the reference held for the pointer being moved in. In the future, direct BPF_XCHG instruction may also be permitted to work like bpf_kptr_xchg helper. Note that process_kptr_func doesn't have to call check_helper_mem_access, since we already disallow rdonly/wronly flags for map, which is what check_map_access_type checks, and we already ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc, so check_map_access is also not required. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
2022-04-22tools: Add kmem_cache_alloc_lru()Matthew Wilcox (Oracle)
Turn kmem_cache_alloc() into a wrapper around kmem_cache_alloc_lru(). Fixes: 9bbdc0f32409 ("xarray: use kmem_cache_alloc_lru to allocate xa_node") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Li Wang <liwang@redhat.com>
2022-04-22objtool: Add CONFIG_OBJTOOLJosh Poimboeuf
Now that stack validation is an optional feature of objtool, add CONFIG_OBJTOOL and replace most usages of CONFIG_STACK_VALIDATION with it. CONFIG_STACK_VALIDATION can now be considered to be frame-pointer specific. CONFIG_UNWINDER_ORC is already inherently valid for live patching, so no need to "validate" it. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Link: https://lkml.kernel.org/r/939bf3d85604b2a126412bf11af6e3bd3b872bcb.1650300597.git.jpoimboe@redhat.com
2022-04-20tools/nolibc/string: Implement `strdup()` and `strndup()`Ammar Faizi
These functions are currently only available on architectures that have my_syscall6() macro implemented. Since these functions use malloc(), malloc() uses mmap(), mmap() depends on my_syscall6() macro. On architectures that don't support my_syscall6(), these function will always return NULL with errno set to ENOSYS. Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc/string: Implement `strnlen()`Ammar Faizi
size_t strnlen(const char *str, size_t maxlen); The strnlen() function returns the number of bytes in the string pointed to by sstr, excluding the terminating null byte ('\0'), but at most maxlen. In doing this, strnlen() looks only at the first maxlen characters in the string pointed to by str and never beyond str[maxlen-1]. The first use case of this function is for determining the memory allocation size in the strndup() function. Link: https://lore.kernel.org/lkml/CAOG64qMpEMh+EkOfjNdAoueC+uQyT2Uv3689_sOr37-JxdJf4g@mail.gmail.com Suggested-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`Ammar Faizi
Implement basic dynamic allocator functions. These functions are currently only available on architectures that have nolibc mmap() syscall implemented. These are not a super-fast memory allocator, but at least they can satisfy basic needs for having heap without libc. Cc: David Laight <David.Laight@ACULAB.COM> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc/types: Implement `offsetof()` and `container_of()` macroAmmar Faizi
Implement `offsetof()` and `container_of()` macro. The first use case of these macros is for `malloc()`, `realloc()` and `free()`. Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc/sys: Implement `mmap()` and `munmap()`Ammar Faizi
Implement mmap() and munmap(). Currently, they are only available for architecures that have my_syscall6 macro. For architectures that don't have, this function will return -1 with errno set to ENOSYS (Function not implemented). This has been tested on x86 and i386. Notes for i386: 1) The common mmap() syscall implementation uses __NR_mmap2 instead of __NR_mmap. 2) The offset must be shifted-right by 12-bit. Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc: i386: Implement syscall with 6 argumentsAmmar Faizi
On i386, the 6th argument of syscall goes in %ebp. However, both Clang and GCC cannot use %ebp in the clobber list and in the "r" constraint without using -fomit-frame-pointer. To make it always available for any kind of compilation, the below workaround is implemented. 1) Push the 6-th argument. 2) Push %ebp. 3) Load the 6-th argument from 4(%esp) to %ebp. 4) Do the syscall (int $0x80). 5) Pop %ebp (restore the old value of %ebp). 6) Add %esp by 4 (undo the stack pointer). Cc: x86@kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com Suggested-by: David Laight <David.Laight@ACULAB.COM> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc: Remove .global _start from the entry point codeAmmar Faizi
Building with clang yields the following error: ``` <inline asm>:3:1: error: _start changed binding to STB_GLOBAL .global _start ^ 1 error generated. ``` Make sure only specify one between `.global _start` and `.weak _start`. Remove `.global _start`. Cc: llvm@lists.linux.dev Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20tools/nolibc: Replace `asm` with `__asm__`Ammar Faizi
Replace `asm` with `__asm__` to support compilation with -std flag. Using `asm` with -std flag makes GCC think `asm()` is a function call instead of an inline assembly. GCC doc says: For the C language, the `asm` keyword is a GNU extension. When writing C code that can be compiled with `-ansi` and the `-std` options that select C dialects without GNU extensions, use `__asm__` instead of `asm`. Link: https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html Reported-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org> Acked-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>