lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message	Author
9 days	Merge tag 'bpf_res_spin_lock' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf resilient spinlock support from Alexei Starovoitov: "This patch set introduces Resilient Queued Spin Lock (or rqspinlock with res_spin_lock() and res_spin_unlock() APIs). This is a qspinlock variant which recovers the kernel from a stalled state when the lock acquisition path cannot make forward progress. This can occur when a lock acquisition attempt enters a deadlock situation (e.g. AA, or ABBA), or more generally, when the owner of the lock (which we’re trying to acquire) isn’t making forward progress. Deadlock detection is the main mechanism used to provide instant recovery, with the timeout mechanism acting as a final line of defense. Detection is triggered immediately when beginning the waiting loop of a lock slow path. Additionally, BPF programs attached to different parts of the kernel can introduce new control flow into the kernel, which increases the likelihood of deadlocks in code not written to handle reentrancy. There have been multiple syzbot reports surfacing deadlocks in internal kernel code due to the diverse ways in which BPF programs can be attached to different parts of the kernel. By switching the BPF subsystem’s lock usage to rqspinlock, all of these issues are mitigated at runtime. This spin lock implementation allows BPF maps to become safer and remove mechanisms that have fallen short in assuring safety when nesting programs in arbitrary ways in the same context or across different contexts. We run benchmarks that stress locking scalability and perform comparison against the baseline (qspinlock). For the rqspinlock case, we replace the default qspinlock with it in the kernel, such that all spin locks in the kernel use the rqspinlock slow path. As such, benchmarks that stress kernel spin locks end up exercising rqspinlock.
More details in the cover letter in commit 6ffb9017e932 ("Merge branch 'resilient-queued-spin-lock'")" * tag 'bpf_res_spin_lock' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (24 commits) selftests/bpf: Add tests for rqspinlock bpf: Maintain FIFO property for rqspinlock unlock bpf: Implement verifier support for rqspinlock bpf: Introduce rqspinlock kfuncs bpf: Convert lpm_trie.c to rqspinlock bpf: Convert percpu_freelist.c to rqspinlock bpf: Convert hashtab.c to rqspinlock rqspinlock: Add locktorture support rqspinlock: Add entry to Makefile, MAINTAINERS rqspinlock: Add macros for rqspinlock usage rqspinlock: Add basic support for CONFIG_PARAVIRT rqspinlock: Add a test-and-set fallback rqspinlock: Add deadlock detection and recovery rqspinlock: Protect waiters in trylock fallback from stalls rqspinlock: Protect waiters in queue from stalls rqspinlock: Protect pending bit owners from stalls rqspinlock: Hardcode cond_acquire loops for arm64 rqspinlock: Add support for timeouts rqspinlock: Drop PV and virtualization support rqspinlock: Add rqspinlock.h header ...
9 days	Merge tag 'bpf-next-6.15' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: "For this merge window we're splitting BPF pull request into three for higher visibility: main changes, res_spin_lock, try_alloc_pages. These are the main BPF changes: - Add DFA-based live registers analysis to improve verification of programs with loops (Eduard Zingerman) - Introduce load_acquire and store_release BPF instructions and add x86, arm64 JIT support (Peilin Ye) - Fix loop detection logic in the verifier (Eduard Zingerman) - Drop unnecessary lock in bpf_map_inc_not_zero() (Eric Dumazet) - Add kfunc for populating cpumask bits (Emil Tsalapatis) - Convert various shell based tests to selftests/bpf/test_progs format (Bastien Curutchet) - Allow passing referenced kptrs into struct_ops callbacks (Amery Hung) - Add a flag to LSM bpf hook to facilitate bpf program signing (Blaise Boscaccy) - Track arena arguments in kfuncs (Ihor Solodrai) - Add copy_remote_vm_str() helper for reading strings from remote VM and bpf_copy_from_user_task_str() kfunc (Jordan Rome) - Add support for timed may_goto instruction (Kumar Kartikeya Dwivedi) - Allow bpf_get_netns_cookie() in cgroup_skb programs (Mahe Tardy) - Reduce bpf_cgrp_storage_busy false positives when accessing cgroup local storage (Martin KaFai Lau) - Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko) - Allow retrieving BTF data with BTF token (Mykyta Yatsenko) - Add BPF kfuncs to set and get xattrs with 'security.bpf.' 
prefix (Song Liu) - Reject attaching programs to noreturn functions (Yafang Shao) - Introduce pre-order traversal of cgroup bpf programs (Yonghong Song)" * tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits) selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid bpf: Fix out-of-bounds read in check_atomic_load/store() libbpf: Add namespace for errstr making it libbpf_errstr bpf: Add struct_ops context information to struct bpf_prog_aux selftests/bpf: Sanitize pointer prior fclose() selftests/bpf: Migrate test_xdp_vlan.sh into test_progs selftests/bpf: test_xdp_vlan: Rename BPF sections bpf: clarify a misleading verifier error message selftests/bpf: Add selftest for attaching fexit to __noreturn functions bpf: Reject attaching fexit/fmod_ret to __noreturn functions bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage bpf: Make perf_event_read_output accessible in all program types. bpftool: Using the right format specifiers bpftool: Add -Wformat-signedness flag to detect format errors selftests/bpf: Test freplace from user namespace libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID bpf: Return prog btf_id without capable check bpf: BPF token support for BPF_BTF_GET_FD_BY_ID bpf, x86: Fix objtool warning for timed may_goto bpf: Check map->record at the beginning of check_and_free_fields() ...
2025-03-19	selftests/bpf: Migrate test_xdp_vlan.sh into test_progs	Bastien Curutchet (eBPF Foundation)
	test_xdp_vlan.sh isn't used by the BPF CI. Migrate test_xdp_vlan.sh into prog_tests/xdp_vlan.c. It uses the same BPF programs located in progs/test_xdp_vlan.c and the same network topology. Remove test_xdp_vlan*.sh and their Makefile entries. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250221-xdp_vlan-v1-2-7d29847169af@bootlin.com/
2025-03-19	selftests/bpf: Add tests for rqspinlock	Kumar Kartikeya Dwivedi
	Introduce selftests that trigger AA and ABBA deadlocks, and test the edge case where the held locks table runs out of entries, since we then fall back to the timeout as the final line of defense. Also exercise the verifier's AA detection where applicable. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250316040541.108729-26-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
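The three situations these selftests target (AA, ABBA, and a full held locks table) can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions: `ToyCpu`, `ToyLock`, `try_acquire`, the result strings, and the table size are all hypothetical names invented for the illustration, and the model does not reproduce the kernel's rqspinlock code or its return values.

```python
# Toy model of deadlock detection with a bounded "held locks" table.
# All names and result values here are hypothetical; this is not the
# kernel's rqspinlock implementation.

ACQUIRED, DEADLOCK, WAIT = "acquired", "deadlock", "wait"


class ToyLock:
    def __init__(self, name):
        self.name = name
        self.owner = None  # ToyCpu currently holding the lock, if any


class ToyCpu:
    def __init__(self, name, table_size=4):
        self.name = name
        self.table_size = table_size
        self.held = []           # bounded table of locks this CPU recorded
        self.waiting_for = None  # lock this CPU is currently waiting on


def try_acquire(cpu, lock):
    """Return ACQUIRED, DEADLOCK, or WAIT (would spin until release/timeout)."""
    if lock in cpu.held:
        return DEADLOCK  # AA: we already hold this lock
    if lock.owner is None:
        lock.owner = cpu
        cpu.waiting_for = None
        # Once the table is full the lock is still taken, but not recorded,
        # so later detection cannot see it.
        if len(cpu.held) < cpu.table_size:
            cpu.held.append(lock)
        return ACQUIRED
    cpu.waiting_for = lock
    owner = lock.owner
    if owner is not cpu and owner.waiting_for in cpu.held:
        cpu.waiting_for = None
        return DEADLOCK  # ABBA: the owner waits on a lock we hold
    return WAIT  # only a release or the timeout can end this wait


a, b = ToyLock("a"), ToyLock("b")
cpu0, cpu1 = ToyCpu("cpu0"), ToyCpu("cpu1")

results = [
    try_acquire(cpu0, a),  # cpu0 takes a
    try_acquire(cpu0, a),  # AA on cpu0
    try_acquire(cpu1, b),  # cpu1 takes b
    try_acquire(cpu0, b),  # cpu0 starts waiting for b
    try_acquire(cpu1, a),  # ABBA: cpu1 wants a while cpu0 waits for b
]

# Table overflow: with room for one entry, the second lock goes unrecorded,
# so re-acquiring it is not detected and only the timeout would recover.
small = ToyCpu("small", table_size=1)
x, y = ToyLock("x"), ToyLock("y")
overflow = [try_acquire(small, x), try_acquire(small, y), try_acquire(small, y)]

print(results)
print(overflow)
```

In this model a `WAIT` result stands for "keep spinning"; the overflow case shows why a timeout is still needed as the last line of defense when the table cannot record every held lock.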
2025-03-18	selftests/bpf: Add selftest for attaching fexit to __noreturn functions	Yafang Shao
	The result: $ tools/testing/selftests/bpf/test_progs --name=fexit_noreturns #99/1 fexit_noreturns/noreturns:OK #99 fexit_noreturns:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20250318114447.75484-3-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-17	selftests/bpf: Test freplace from user namespace	Mykyta Yatsenko
	Add selftests to verify that it is possible to load freplace program from user namespace if BPF token is initialized by bpf_object__prepare before calling bpf_program__set_attach_target. Negative test is added as well. Modified type of the priv_prog to xdp, as kprobe did not work on aarch64 and s390x. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20250317174039.161275-5-mykyta.yatsenko5@gmail.com
2025-03-15	selftests/bpf: Fix arena_spin_lock compilation on PowerPC	Kumar Kartikeya Dwivedi
	Venkat reported a compilation error for BPF selftests on PowerPC [0]. The crux of the error is the following message: In file included from progs/arena_spin_lock.c:7: /root/bpf-next/tools/testing/selftests/bpf/bpf_arena_spin_lock.h:122:8: error: member reference base type '__attribute__((address_space(1))) u32' (aka '__attribute__((address_space(1))) unsigned int') is not a structure or union 122 | old = atomic_read(&lock->val); This is because PowerPC overrides the qspinlock type changing the lock->val member's type from atomic_t to u32. To remedy this, import the asm-generic version in the arena spin lock header, name it __qspinlock (since it's aliased to arena_spinlock_t, the actual name hardly matters), and adjust the selftest to not depend on the type in vmlinux.h. [0]: https://lore.kernel.org/bpf/7bc80a3b-d708-4735-aa3b-6a8c21720f9d@linux.ibm.com Fixes: 88d706ba7cc5 ("selftests/bpf: Introduce arena spin lock") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/bpf/20250311154244.3775505-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Add a kernel flag test for LSM bpf hook	Blaise Boscaccy
	This test exercises the kernel flag added to security_bpf by effectively blocking light-skeletons from loading while allowing normal skeletons to function as-is. Since this should work with any arbitrary BPF program, an existing program from LSKELS_EXTRA was used as a test payload. Signed-off-by: Blaise Boscaccy <bboscaccy@linux.microsoft.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250310221737.821889-3-bboscaccy@linux.microsoft.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Convert comma to semicolon	Chen Ni
	Replace commas between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Anton Protopopov <aspsk@isovalent.com> Link: https://lore.kernel.org/bpf/20250310032045.651068-1-nichen@iscas.ac.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests: bpf: fix duplicate selftests in cpumask_success.	Emil Tsalapatis
	The BPF cpumask selftests are currently run twice in test_progs/cpumask.c, once by traversing cpumask_success_testcases, and once by invoking RUN_TESTS(cpumask_success). Remove the invocation of RUN_TESTS to properly run the selftests only once. Now that the tests are run only through cpumask_success_testcases, add to it the missing test_refcount_null_tracking testcase. Also remove the __success annotation from it, since it is now loaded and invoked by the runner. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests: bpf: add bpf_cpumask_populate selftests	Emil Tsalapatis
	Add selftests for the bpf_cpumask_populate helper that sets a bpf_cpumask to a bit pattern provided by a BPF program. Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20250309230427.26603-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: lwt_seg6local: Move test to test_progs	Bastien Curutchet (eBPF Foundation)
	test_lwt_seg6local.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_seg6local.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_seg6local.c. Use the network helpers instead of `nc` to exchange the final packet. Remove test_lwt_seg6local.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20250307-seg6local-v1-2-990fff8f180d@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Fix cap_enable_effective() return code	Feng Yang
	The caller of cap_enable_effective() expects negative error code. Fix it. Before: failed to restore CAP_SYS_ADMIN: -1, Unknown error -1 After: failed to restore CAP_SYS_ADMIN: -3, No such process failed to restore CAP_SYS_ADMIN: -22, Invalid argument Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250305022234.44932-1-yangfeng59949@163.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Move test_lwt_ip_encap to test_progs	Bastien Curutchet (eBPF Foundation)
	test_lwt_ip_encap.sh isn't used by the BPF CI. Add a new file in the test_progs framework to migrate the tests done by test_lwt_ip_encap.sh. It uses the same network topology and the same BPF programs located in progs/test_lwt_ip_encap.c. Rework the GSO part to avoid using nc and dd. Remove test_lwt_ip_encap.sh and its Makefile entry. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250304-lwt_ip-v1-1-8fdeb9e79a56@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Add tests for arena spin lock	Kumar Kartikeya Dwivedi
	Add some basic selftests for qspinlock built over BPF arena using cond_break_label macro. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250306035431.2186189-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test cases for compute_live_registers()	Eduard Zingerman
	Cover instructions from each kind: - assignment - arithmetic - store/load - endian conversion - atomics - branches, conditional branches, may_goto, calls - LD_ABS/LD_IND - address_space_cast Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-6-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	bpf: simple DFA-based live registers analysis	Eduard Zingerman
	Compute may-live registers before each instruction in the program. A register is live before instruction I if it is read by I or some instruction S following I during program execution and is not overwritten between I and S. This information would be used in the next patch as a hint in func_states_equal(). Use a simple algorithm described in [1] to compute this information: - define the following: - I.use : a set of all registers read by instruction I; - I.def : a set of all registers written by instruction I; - I.in : a set of all registers that may be alive before I execution; - I.out : a set of all registers that may be alive after I execution; - I.successors : a set of instructions S that might immediately follow I for some program execution; - associate separate empty sets 'I.in' and 'I.out' with each instruction; - visit each instruction in a postorder and update corresponding 'I.in' and 'I.out' sets as follows: I.out = U [S.in for S in I.successors] I.in = (I.out / I.def) U I.use (where U stands for set union, / stands for set difference) - repeat the computation while I.{in,out} changes for any instruction. On the implementation side, keep things as simple as possible: - check_cfg() already marks instructions EXPLORED in post-order, modify it to save the index of each EXPLORED instruction in a vector; - represent I.{in,out,use,def} as bitmasks; - don't split the program into basic blocks and don't maintain the work queue, instead: - do fixed-point computation by visiting each instruction; - maintain a simple 'changed' flag if I.{in,out} for any instruction change; Measurements show that even such a simplistic implementation does not add measurable verification time overhead (for selftests, at least). Note on check_cfg() ex_insn_beg/ex_done change: To avoid out of bounds access to env->cfg.insn_postorder array, it should be guaranteed that an instruction transitions to the EXPLORED state only once. 
Previously this was not the case for incorrect programs with direct calls to exception callbacks. The 'align' selftest needs adjustment to skip computed insn/live registers printout. Otherwise it matches lines from the live registers printout. [1] https://en.wikipedia.org/wiki/Live-variable_analysis Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250304195024.2478889-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
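The fixed-point scheme described in this commit message (bitmask sets, a post-order visit, and a single 'changed' flag) can be sketched in a few lines. This is an illustrative model, not the verifier's code: the instruction encoding as `(use mask, def mask, successor list)` tuples, the function names, and the two sample programs are assumptions made for the example.

```python
# Toy live-register analysis: I.out = union of S.in over successors,
# I.in = (I.out minus I.def) union I.use, repeated until nothing changes.
# The tuple encoding (use_mask, def_mask, successors) is an assumption
# made for this sketch, not the verifier's data structure.

R0, R1, R2, R10 = 1 << 0, 1 << 1, 1 << 2, 1 << 10


def postorder(insns):
    """Return instruction indices in DFS post-order, starting at insn 0."""
    seen, order = {0}, []
    stack = [(0, iter(insns[0][2]))]
    while stack:
        node, successors = stack[-1]
        for succ in successors:
            if succ not in seen:
                seen.add(succ)
                stack.append((succ, iter(insns[succ][2])))
                break
        else:
            # Every successor was handled: the instruction is fully explored.
            order.append(node)
            stack.pop()
    return order


def compute_live_registers(insns):
    """Return (live_in, live_out) bitmask lists, one entry per instruction."""
    live_in = [0] * len(insns)
    live_out = [0] * len(insns)
    order = postorder(insns)
    changed = True
    while changed:
        changed = False
        for i in order:
            use, define, successors = insns[i]
            out = 0
            for succ in successors:
                out |= live_in[succ]
            new_in = (out & ~define) | use
            if out != live_out[i] or new_in != live_in[i]:
                live_out[i], live_in[i] = out, new_in
                changed = True
    return live_in, live_out


STRAIGHT = [
    (0, R1, [1]),        # r1 = 8
    (R10, R2, [2]),      # r2 = r10
    (R1 | R2, R2, [3]),  # r2 += r1
    (0, R0, [4]),        # r0 = 0
    (R0, 0, []),         # exit (reads r0)
]

LOOP = [
    (0, R1, [1]),        # r1 = 10
    (R1, R1, [2]),       # r1 -= 1
    (R1, 0, [1, 3]),     # if r1 != 0 goto 1
    (0, R0, [4]),        # r0 = 0
    (R0, 0, []),         # exit (reads r0)
]

straight_in, straight_out = compute_live_registers(STRAIGHT)
loop_in, loop_out = compute_live_registers(LOOP)
print([bin(mask) for mask in straight_in])
print([bin(mask) for mask in loop_out])
```

The loop program needs more than one pass: on the first visit of the branch, the live-in set of its back-edge target is still empty, so the 'changed' flag triggers another round.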
2025-03-15	selftests/bpf: Add selftests for load-acquire and store-release instructions	Peilin Ye
	Add several ./test_progs tests: - arena_atomics/load_acquire - arena_atomics/store_release - verifier_load_acquire/* - verifier_store_release/* - verifier_precision/bpf_load_acquire - verifier_precision/bpf_store_release The last two tests are added to check if backtrack_insn() handles the new instructions correctly. Additionally, the last test also makes sure that the verifier "remembers" the value (in src_reg) we store-release into e.g. a stack slot. For example, if we take a look at the test program: #0: r1 = 8; /* store_release((u64 *)(r10 - 8), r1); */ #1: .8byte %[store_release]; #2: r1 = *(u64 *)(r10 - 8); #3: r2 = r10; #4: r2 += r1; #5: r0 = 0; #6: exit; At #1, if the verifier doesn't remember that we wrote 8 to the stack, then later at #4 we would be adding an unbounded scalar value to the stack pointer, which would cause the program to be rejected: VERIFIER LOG: ============= ... math between fp pointer and register with unbounded min value is not allowed For easier CI integration, instead of using built-ins like __atomic_{load,store}_n() which depend on the new __BPF_FEATURE_LOAD_ACQ_STORE_REL pre-defined macro, manually craft load-acquire/store-release instructions using __imm_insn(), as suggested by Eduard. All new tests depend on: (1) Clang major version >= 18, and (2) ENABLE_ATOMICS_TESTS is defined (currently implies -mcpu=v3 or v4), and (3) JIT supports load-acquire/store-release (currently arm64 and x86-64) In .../progs/arena_atomics.c: /* 8-byte-aligned */ __u8 __arena_global load_acquire8_value = 0x12; /* 1-byte hole */ __u16 __arena_global load_acquire16_value = 0x1234; That 1-byte hole in the .addr_space.1 ELF section caused clang-17 to crash: fatal error: error in backend: unable to write nop sequence of 1 bytes To work around such llvm-17 CI job failures, conditionally define __arena_global variables as 64-bit if __clang_major__ < 18, to make sure .addr_space.1 has no holes. 
Ideally we should avoid compiling this file using clang-17 at all (arena tests depend on __BPF_FEATURE_ADDR_SPACE_CAST, and are skipped for llvm-17 anyway), but that is a separate topic. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/1b46c6feaf0f1b6984d9ec80e500cc7383e9da1a.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
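The 1-byte hole mentioned in this message follows from simple alignment arithmetic. The sketch below assumes that the two variables are placed in declaration order and that each is aligned to its own size; both are assumptions made for illustration, and `align_up` and `layout` are hypothetical helpers, not part of any toolchain.

```python
# Alignment arithmetic behind the "1-byte hole": a 1-byte variable followed
# by a 2-byte variable leaves one padding byte between them.
# Assumptions: variables are laid out in declaration order and each one is
# aligned to its own size.


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(variables):
    """Map (name, size) pairs to (name, offset, hole_before) triples."""
    placed, end = [], 0
    for name, size in variables:
        offset = align_up(end, size)
        placed.append((name, offset, offset - end))
        end = offset + size
    return placed


# __u8 followed by __u16, as in the snippet quoted in the commit message.
original = layout([("load_acquire8_value", 1), ("load_acquire16_value", 2)])

# The workaround described above: define both variables as 64-bit.
workaround = layout([("load_acquire8_value", 8), ("load_acquire16_value", 8)])

print(original)
print(workaround)
```

With both variables widened to 8 bytes, every offset is already a multiple of the next variable's alignment, so no padding is needed.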
2025-03-15	selftests/bpf: Add tests for bpf_object__prepare	Mykyta Yatsenko
	Add selftests, checking that running bpf_object__prepare successfully creates maps before load step. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-5-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move ip6tnl tunnel tests to test_progs	Bastien Curutchet (eBPF Foundation)
	ip6tnl tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6tnl tunnels. It uses the same network topology and the same BPF programs as the script. Remove test_ipip6() and test_ip6ip6() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-9-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move ip6geneve tunnel test to test_progs	Bastien Curutchet (eBPF Foundation)
	ip6geneve tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6geneve tunnels. It uses the same network topology and the same BPF programs as the script. Remove test_ip6geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-8-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move geneve tunnel test to test_progs	Bastien Curutchet (eBPF Foundation)
	geneve tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test geneve tunnels. It uses the same network topology and the same BPF programs as the script. Remove test_geneve() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-7-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move ip6erspan tunnel test to test_progs	Bastien Curutchet (eBPF Foundation)
	ip6erspan tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6erspan tunnels. It uses the same network topology and the same BPF programs as the script. Remove test_ip6erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-6-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move erspan tunnel tests to test_progs	Bastien Curutchet (eBPF Foundation)
	erspan tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test erspan tunnels. It uses the same network topology and the same BPF programs as the script. Remove test_erspan() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-5-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move ip6gre tunnel test to test_progs	Bastien Curutchet (eBPF Foundation)
	ip6gre tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test ip6gre tunnels. It uses the same network topology and the same BPF programs as the script. Disable the IPv6 DAD feature because it can take a lot of time and cause some tests to fail depending on the environment they're run on. Remove test_ip6gre() and test_ip6gretap() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-4-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Move gre tunnel test to test_progs	Bastien Curutchet (eBPF Foundation)
	gre tunnels are tested in test_tunnel.sh but not in the test_progs framework. Add a new test in test_progs to test gre tunnels. It uses the same network topology and the same BPF programs as the script. Remove test_gre() and test_gre_no_tunnel_key() from the script. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-3-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Add ping helpers	Bastien Curutchet (eBPF Foundation)
	All tests use more or less the same ping commands as final validation. Also test_ping()'s return value is checked with ASSERT_OK() while this check is already done by the SYS() macro inside test_ping(). Create helpers around test_ping() and use them in the tests to avoid code duplication. Remove the unnecessary ASSERT_OK() from the tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-2-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: test_tunnel: Add generic_attach* helpers	Bastien Curutchet (eBPF Foundation)
	A fair amount of code duplication is present among tests to attach BPF programs. Create generic_attach* helpers that attach BPF programs to a given interface. Use ASSERT_OK_FD() instead of ASSERT_GE() to check fd's validity. Use these helpers in all the available tests. Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250303-tunnels-v2-1-8329f38f0678@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Add tests for extending sleepable global subprogs	Kumar Kartikeya Dwivedi
	Add tests for freplace behavior with the combination of sleepable and non-sleepable global subprogs. The changes_pkt_data selftest did all the hard work, so simply rename it and include new support for more summarization tests for might_sleep bit. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250301151846.1552362-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Test sleepable global subprogs in atomic contexts	Kumar Kartikeya Dwivedi
	Add tests for rejecting sleepable and accepting non-sleepable global function calls in atomic contexts. For spin locks, we still reject all global function calls. Once resilient spin locks land, we will carefully lift this restriction in cases where we deem it safe. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250301151846.1552362-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Add selftests allowing cgroup prog pre-ordering	Yonghong Song
	Add a few selftests with cgroup prog pre-ordering. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250224230121.283601-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Allow auto port binding for bpf nf	Jiayuan Chen
	Allow auto port binding for bpf nf test to avoid binding conflict. ./test_progs -a bpf_nf 24/1 bpf_nf/xdp-ct:OK 24/2 bpf_nf/tc-bpf-ct:OK 24/3 bpf_nf/alloc_release:OK 24/4 bpf_nf/insert_insert:OK 24/5 bpf_nf/lookup_insert:OK 24/6 bpf_nf/set_timeout_after_insert:OK 24/7 bpf_nf/set_status_after_insert:OK 24/8 bpf_nf/change_timeout_after_alloc:OK 24/9 bpf_nf/change_status_after_alloc:OK 24/10 bpf_nf/write_not_allowlisted_field:OK 24 bpf_nf:OK Summary: 1/10 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-3-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15	selftests/bpf: Allow auto port binding for cgroup connect	Jiayuan Chen
	Allow auto port binding for cgroup connect test to avoid binding conflict. Result: ./test_progs -a cgroup_v1v2 59 cgroup_v1v2:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Link: https://lore.kernel.org/r/20250227142646.59711-2-jiayuan.chen@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
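The two commits above do not spell out how the port is chosen. One common way to avoid binding conflicts between tests, assumed here purely for illustration, is to bind to port 0 so that the kernel picks a free port, and then read the chosen port back with getsockname. The helper name `bind_to_free_port` is hypothetical and is not taken from the selftests.

```python
import socket


def bind_to_free_port(host="127.0.0.1"):
    """Bind a TCP socket to a kernel-chosen port and return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the kernel to pick any currently unused port.
    sock.bind((host, 0))
    port = sock.getsockname()[1]
    return sock, port


# Two sockets bound at the same time cannot share a port, so tests running
# side by side do not collide the way they could with a hard-coded port.
first, first_port = bind_to_free_port()
second, second_port = bind_to_free_port()
print(first_port, second_port)
first.close()
second.close()
```

A test written this way passes the discovered port to its client side instead of relying on a fixed number.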
2025-03-15	selftests/bpf: Add tests for bpf_dynptr_copy	Mykyta Yatsenko
	Add XDP setup type for dynptr tests, enabling testing for non-contiguous buffer. Add 2 tests: - test_dynptr_copy - verify correctness for the fast (contiguous buffer) code path. - test_dynptr_copy_xdp - verifies code paths that handle non-contiguous buffer. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250226183201.332713-4-mykyta.yatsenko5@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-07	Merge tag 'for-netdev' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-03-06 We've added 6 non-merge commits during the last 13 day(s) which contain a total of 6 files changed, 230 insertions(+), 56 deletions(-). The main changes are: 1) Add XDP metadata support for tun driver, from Marcus Wichelmann. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Fix file descriptor assertion in open_tuntap helper selftests/bpf: Add test for XDP metadata support in tun driver selftests/bpf: Refactor xdp_context_functional test and bpf program selftests/bpf: Move open_tuntap to network helpers net: tun: Enable transfer of XDP metadata to skb net: tun: Enable XDP metadata support ==================== Link: https://patch.msgid.link/20250307055335.441298-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06	selftests/bpf: Add test for XDP metadata support in tun driver	Marcus Wichelmann
	Add a selftest that creates a tap device, attaches XDP and TC programs, writes a packet with a test payload into the tap device and checks the test result. This test ensures that the XDP metadata support in the tun driver is enabled and that the metadata size is correctly passed to the skb. See the previous commit ("selftests/bpf: refactor xdp_context_functional test and bpf program") for details about the test design. The test runs in its own network namespace. This provides some extra safety against conflicting interface names. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305213438.3863922-6-marcus.wichelmann@hetzner-cloud.de
2025-03-06	selftests/bpf: Refactor xdp_context_functional test and bpf program	Marcus Wichelmann
	The existing XDP metadata test works by creating a veth pair and attaching XDP & TC programs that drop the packet when the condition of the test isn't fulfilled. The test then pings through the veth pair and succeeds when the ping comes through. While this test works great for a veth pair, it is hard to replicate for tap devices to test the XDP metadata support of them. A similar test for the tun driver would either involve logic to reply to the ping request, or would have to capture the packet to check if it was dropped or not. To make the testing of other drivers easier while still maximizing code reuse, this commit refactors the existing xdp_context_functional test to use a test_result map. Instead of conditionally passing or dropping the packet, the TC program is changed to copy the received metadata into the value of that single-entry array map. Tests can then verify that the map value matches the expectation. This testing logic is easy to adapt to other network drivers as the only remaining requirement is that there is some way to send a custom Ethernet packet through it that triggers the XDP & TC programs. The Ethernet header of that custom packet is all-zero, because it is not required to be valid for the test to work. The zero ethertype also helps to filter out packets that are not related to the test and would otherwise interfere with it. The payload of the Ethernet packet is used as the test data that is expected to be passed as metadata from the XDP to the TC program and written to the map. It has a fixed size of 32 bytes which is a reasonable size that should be supported by both drivers. Additional packet headers are not necessary for the test and were therefore skipped to keep the testing code short. This new testing methodology no longer requires the veth interfaces to have IP addresses assigned, therefore these were removed. 
Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250305213438.3863922-5-marcus.wichelmann@hetzner-cloud.de
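The frame layout described above (an all-zero Ethernet header followed by a 32-byte payload that should later reappear in the test_result map) can be sketched in plain user-space C. This is only an illustration under stated assumptions: the helper names `build_test_frame` and `result_matches` are hypothetical and are not taken from the selftest, and the 14-byte header length is the standard Ethernet header size rather than a value quoted in the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN      14 /* dst MAC (6) + src MAC (6) + ethertype (2) */
#define TEST_PAYLOAD_LEN 32 /* fixed payload size named in the commit message */

/*
 * Hypothetical helper: write an all-zero Ethernet header followed by the
 * test payload into `frame`. Returns the frame length, or 0 if the buffer
 * is too small to hold header plus payload.
 */
size_t build_test_frame(uint8_t *frame, size_t cap,
                        const uint8_t payload[TEST_PAYLOAD_LEN])
{
    if (cap < ETH_HDR_LEN + TEST_PAYLOAD_LEN)
        return 0;

    memset(frame, 0, ETH_HDR_LEN);                          /* zero MACs and ethertype */
    memcpy(frame + ETH_HDR_LEN, payload, TEST_PAYLOAD_LEN); /* test data */
    return ETH_HDR_LEN + TEST_PAYLOAD_LEN;
}

/*
 * Hypothetical check: compare the value read back from the single-entry
 * result map with the payload that was sent. Returns 1 on a match.
 */
int result_matches(const uint8_t *map_value,
                   const uint8_t payload[TEST_PAYLOAD_LEN])
{
    return memcmp(map_value, payload, TEST_PAYLOAD_LEN) == 0;
}
```

A test would write the frame into the device under test, read the map value back, and pass it to `result_matches`; how the frame is written and how the map is read are left out here because they depend on the selftest's own helpers.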
2025-03-06	selftests/bpf: Move open_tuntap to network helpers	Marcus Wichelmann
	To test the XDP metadata functionality of the tun driver, it's necessary to create a new tap device first. A helper function for this already exists in lwt_helpers.h. Move it to the common network helpers header, so it can be reused in other tests. Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250305213438.3863922-4-marcus.wichelmann@hetzner-cloud.de
2025-02-27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.14-rc5). Conflicts: drivers/net/ethernet/cadence/macb_main.c fa52f15c745c ("net: cadence: macb: Synchronize stats calculations") 75696dd0fd72 ("net: cadence: macb: Convert to get_stats64") https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au Adjacent changes: drivers/net/ethernet/intel/ice/ice_sriov.c 79990cf5e7ad ("ice: Fix deinitializing VF in error path") a203163274a4 ("ice: simplify VF MSI-X managing") net/ipv4/tcp.c 18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace") 297d389e9e5b ("net: prefix devmem specific helpers") net/mptcp/subflow.c 8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join") c3349a22c200 ("mptcp: consolidate subflow cleanup") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26	selftests/bpf: Introduce veristat test	Mykyta Yatsenko
	Introduce a test for veristat as part of test_progs. The test cases cover setting global variables in a BPF program. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250225163101.121043-3-mykyta.yatsenko5@gmail.com
2025-02-26	selftests/bpf: Test bpf_usdt_arg_size() function	Ihor Solodrai
	Update usdt tests to also check for correct behavior of bpf_usdt_arg_size(). Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20250224235756.2612606-2-ihor.solodrai@linux.dev
2025-02-26	selftests/bpf: add cgroup_skb netns cookie tests	Mahe Tardy
	Add a netns cookie test that verifies the helper is now supported and works in the context of cgroup_skb programs. Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com> Link: https://lore.kernel.org/r/20250225125031.258740-2-mahe.tardy@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-25	selftests/bpf: Test gen_pro/epilogue that generate kfuncs	Amery Hung
	Test gen_prologue and gen_epilogue that generate kfuncs that have not been seen in the main program. The main bpf program and return value checks are identical to pro_epilogue.c introduced in commit 47e69431b57a ("selftests/bpf: Test gen_prologue and gen_epilogue"). However, now when bpf_testmod_st_ops detects a program name with prefix "test_kfunc_", it generates a slightly different prologue and epilogue: they still add 1000 to args->a in the prologue, and add 10000 to args->a and set r0 to 2 * args->a in the epilogue, but involve kfuncs. At a high level, the alternative version of the prologue and epilogue looks like this:
	
	  cgrp = bpf_cgroup_from_id(0);
	  if (cgrp)
	          bpf_cgroup_release(cgrp);
	  else
	          /* Perform what original bpf_testmod_st_ops prologue or
	           * epilogue does
	           */
	
	Since 0 is never a valid cgroup id, the original prologue or epilogue logic will be performed. As a result, the __retval check should expect the exact same return value. Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250225233545.285481-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
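The arithmetic in the message above can be followed with a small user-space model. This is a sketch, not the bpf_testmod code: `cgroup_from_id_stub`, `model_prologue` and `model_epilogue` are hypothetical names, the stub only imitates the stated fact that id 0 never resolves to a cgroup, and what the real generated code does on the non-NULL branch beyond releasing the reference is not modelled.

```c
#include <stddef.h>
#include <stdint.h>

struct st_ops_args {
    uint64_t a;
};

struct cgroup_stub {
    int unused;
};

/* Stand-in for bpf_cgroup_from_id(): id 0 is never valid, so it yields NULL. */
static struct cgroup_stub *cgroup_from_id_stub(uint64_t id)
{
    static struct cgroup_stub dummy;

    return id == 0 ? NULL : &dummy;
}

/* Model of the generated prologue: falls through to the original logic. */
void model_prologue(struct st_ops_args *args)
{
    struct cgroup_stub *cgrp = cgroup_from_id_stub(0);

    if (cgrp)
        return;          /* release path, never taken for id 0 (not modelled) */
    args->a += 1000;     /* original prologue behaviour */
}

/* Model of the generated epilogue: returns the value that would land in r0. */
uint64_t model_epilogue(struct st_ops_args *args)
{
    struct cgroup_stub *cgrp = cgroup_from_id_stub(0);

    if (cgrp)
        return 0;        /* release path, never taken for id 0 (not modelled) */
    args->a += 10000;    /* original epilogue behaviour */
    return 2 * args->a;
}
```

With args->a starting at 0 and a main program that leaves it unchanged, the model gives args->a = 1000 after the prologue and a return value of 2 * 11000 = 22000 after the epilogue, which is the same result the non-kfunc variant would produce under the same assumptions.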
2025-02-21	Merge tag 'for-netdev' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-02-20 We've added 19 non-merge commits during the last 8 day(s) which contain a total of 35 files changed, 1126 insertions(+), 53 deletions(-). The main changes are: 1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing 2) Add network TX timestamping support to BPF sock_ops, from Jason Xing 3) Add TX metadata Launch Time support, from Song Yoong Siang * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: igc: Add launch time support to XDP ZC igc: Refactor empty frame insertion for launch time support net: stmmac: Add launch time support to XDP ZC selftests/bpf: Add launch time request to xdp_hw_metadata xsk: Add launch time hardware offload support to XDP Tx metadata selftests/bpf: Add simple bpf tests in the tx path for timestamping feature bpf: Support selective sampling for bpf timestamping bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING bpf: Disable unsafe helpers in TX timestamping callbacks bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping bpf: Add networking timestamping support to bpf_get/setsockopt() selftests/bpf: Add rto max for bpf_setsockopt test bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt ==================== Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-20	selftests/bpf: Test struct_ops program with __ref arg calling bpf_tail_call	Amery Hung
	Test if the verifier rejects a struct_ops program with a __ref argument calling bpf_tail_call(). Signed-off-by: Amery Hung <ameryhung@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250220221532.1079331-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4	Alexei Starovoitov
	Cross-merge bpf fixes after downstream PR (bpf-6.14-rc4). Minor conflict: kernel/bpf/btf.c Adjacent changes: kernel/bpf/arena.c kernel/bpf/btf.c kernel/bpf/syscall.c kernel/bpf/verifier.c mm/memory.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-20	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Linus Torvalds
	Pull BPF fixes from Daniel Borkmann: - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels (Alan Maguire) - Fix a missing allocation failure check in BPF verifier's acquire_lock_state (Kumar Kartikeya Dwivedi) - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima) - Fix a deadlock when freeing BPF cgroup storage (Abel Wu) - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex (Andrii Nakryiko) - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is accessing skb data not containing an Ethernet header (Shigeru Yoshida) - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai) - Several BPF sockmap fixes to address incorrect TCP copied_seq calculations, which prevented correct data reads from recv(2) in user space (Jiayuan Chen) - Two fixes for BPF map lookup nullness elision (Daniel Xu) - Fix a NULL-pointer dereference from vmlinux BTF lookup in bpf_sk_storage_tracing_allowed (Jared Kangas) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests: bpf: test batch lookup on array of maps with holes bpf: skip non exist keys in generic_map_lookup_batch bpf: Handle allocation failure in acquire_lock_state bpf: verifier: Disambiguate get_constant_map_key() errors bpf: selftests: Test constant key extraction on irrelevant maps bpf: verifier: Do not extract constant map keys for irrelevant maps bpf: Fix softlockup in arena_map_free on 64k page kernel net: Add rx_skb of kfree_skb to raw_tp_null_args[]. 
bpf: Fix deadlock when freeing cgroup storage selftests/bpf: Add strparser test for bpf selftests/bpf: Fix invalid flag of recv() bpf: Disable non stream socket for strparser bpf: Fix wrong copied_seq calculation strparser: Add read_sock callback bpf: avoid holding freeze_mutex during mmap operation bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic selftests/bpf: Adjust data size to have ETH_HLEN bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type() bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed
2025-02-20	selftests/bpf: Add simple bpf tests in the tx path for timestamping feature	Jason Xing
	The BPF program calculates a couple of latency deltas between the tx timestamping callbacks. It can be used in the real world to diagnose kernel behaviour in the tx path. Check the safety issues by accessing a few bpf calls in bpf_test_access_bpf_calls(), which are implemented in patches 3 and 4. Check if the bpf timestamping can co-exist with socket timestamping. There remain a few realistic things[1][2] to highlight: 1. In general a packet may pass through multiple qdiscs, for instance with bonding or tunnel virtual devices in the egress path. 2. Packets may be resent, in which case an ACK might precede a repeat SCHED and SND. 3. Erroneous or malicious peers may also just never send an ACK. [1]: https://lore.kernel.org/all/67a389af981b0_14e0832949d@willemb.c.googlers.com.notmuch/ [2]: https://lore.kernel.org/all/c329a0c1-239b-4ca1-91f2-cb30b8dd2f6a@linux.dev/ Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-13-kerneljasonxing@gmail.com
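The idea of computing latency deltas between tx stages, while tolerating the caveats listed above (a missing ACK, or an ACK that precedes a repeated SCHED/SND), can be sketched in user-space C. This is not the selftest's BPF program: the `tx_stage` enum, the `tx_stamps` struct and `stage_delta` are hypothetical names, and using 0 to mean "stage not observed" is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage indices, loosely following the callback names. */
enum tx_stage {
    TX_SENDMSG,
    TX_SCHED,
    TX_SND_SW,
    TX_ACK,
    TX_STAGE_MAX
};

/* One timestamp (in nanoseconds) per stage; 0 means the stage was not seen. */
struct tx_stamps {
    uint64_t ns[TX_STAGE_MAX];
};

/*
 * Compute the latency between two stages. Returns false, leaving *delta_ns
 * untouched, when either stage is missing (e.g. the peer never sent an ACK)
 * or when the later stage carries an earlier timestamp (e.g. an ACK recorded
 * before a retransmission's SCHED/SND).
 */
bool stage_delta(const struct tx_stamps *s, enum tx_stage from,
                 enum tx_stage to, uint64_t *delta_ns)
{
    uint64_t start = s->ns[from];
    uint64_t end = s->ns[to];

    if (start == 0 || end == 0)
        return false;
    if (end < start)
        return false;

    *delta_ns = end - start;
    return true;
}
```

Returning a status instead of a possibly wrapped unsigned difference keeps out-of-order or incomplete samples from being reported as very large latencies.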
2025-02-19	selftests/bpf: Add a specific dst port matching	Cong Wang
	After this patch: #102/1 flow_dissector_classification/ipv4:OK #102/2 flow_dissector_classification/ipv4_continue_dissect:OK #102/3 flow_dissector_classification/ipip:OK #102/4 flow_dissector_classification/gre:OK #102/5 flow_dissector_classification/port_range:OK #102/6 flow_dissector_classification/ipv6:OK #102 flow_dissector_classification:OK Summary: 1/6 PASSED, 0 SKIPPED, 0 FAILED Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Link: https://patch.msgid.link/20250218043210.732959-5-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-19	selftests/bpf: Add tests for bpf_copy_from_user_task_str	Jordan Rome
	This adds tests for both the happy path and the error path (with and without the BPF_F_PAD_ZEROS flag). Signed-off-by: Jordan Rome <linux@jordanrome.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250213152125.1837400-3-linux@jordanrome.com