lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message	Author
10 days	Merge tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	Linus Torvalds
	Pull bpf updates from Alexei Starovoitov:
	"For this merge window we're splitting BPF pull request into three for higher visibility: main changes, res_spin_lock, try_alloc_pages.
	These are the main BPF changes:
	- Add DFA-based live registers analysis to improve verification of programs with loops (Eduard Zingerman)
	- Introduce load_acquire and store_release BPF instructions and add x86, arm64 JIT support (Peilin Ye)
	- Fix loop detection logic in the verifier (Eduard Zingerman)
	- Drop unnecessary lock in bpf_map_inc_not_zero() (Eric Dumazet)
	- Add kfunc for populating cpumask bits (Emil Tsalapatis)
	- Convert various shell based tests to selftests/bpf/test_progs format (Bastien Curutchet)
	- Allow passing referenced kptrs into struct_ops callbacks (Amery Hung)
	- Add a flag to LSM bpf hook to facilitate bpf program signing (Blaise Boscaccy)
	- Track arena arguments in kfuncs (Ihor Solodrai)
	- Add copy_remote_vm_str() helper for reading strings from remote VM and bpf_copy_from_user_task_str() kfunc (Jordan Rome)
	- Add support for timed may_goto instruction (Kumar Kartikeya Dwivedi)
	- Allow bpf_get_netns_cookie() in cgroup_skb programs (Mahe Tardy)
	- Reduce bpf_cgrp_storage_busy false positives when accessing cgroup local storage (Martin KaFai Lau)
	- Introduce bpf_dynptr_copy() kfunc (Mykyta Yatsenko)
	- Allow retrieving BTF data with BTF token (Mykyta Yatsenko)
	- Add BPF kfuncs to set and get xattrs with 'security.bpf.' prefix (Song Liu)
	- Reject attaching programs to noreturn functions (Yafang Shao)
	- Introduce pre-order traversal of cgroup bpf programs (Yonghong Song)"
	* tag 'bpf-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (186 commits)
	  selftests/bpf: Add selftests for load-acquire/store-release when register number is invalid
	  bpf: Fix out-of-bounds read in check_atomic_load/store()
	  libbpf: Add namespace for errstr making it libbpf_errstr
	  bpf: Add struct_ops context information to struct bpf_prog_aux
	  selftests/bpf: Sanitize pointer prior fclose()
	  selftests/bpf: Migrate test_xdp_vlan.sh into test_progs
	  selftests/bpf: test_xdp_vlan: Rename BPF sections
	  bpf: clarify a misleading verifier error message
	  selftests/bpf: Add selftest for attaching fexit to __noreturn functions
	  bpf: Reject attaching fexit/fmod_ret to __noreturn functions
	  bpf: Only fails the busy counter check in bpf_cgrp_storage_get if it creates storage
	  bpf: Make perf_event_read_output accessible in all program types.
	  bpftool: Using the right format specifiers
	  bpftool: Add -Wformat-signedness flag to detect format errors
	  selftests/bpf: Test freplace from user namespace
	  libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
	  bpf: Return prog btf_id without capable check
	  bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
	  bpf, x86: Fix objtool warning for timed may_goto
	  bpf: Check map->record at the beginning of check_and_free_fields()
	  ...
2025-03-15	selftests/bpf: Fix sockopt selftest failure on powerpc	Saket Kumar Bhaskar
	The SO_RCVLOWAT option is defined as 18 in the selftest header, which matches the generic definition. However, on powerpc, SO_RCVLOWAT is defined as 16. This discrepancy causes sol_socket_sockopt() to fail with the default switch case on powerpc. This commit fixes it by defining SO_RCVLOWAT as 16 for powerpc. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/bpf/20250311084647.3686544-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
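The per-architecture definition described in the commit above could look roughly like the following. This is a minimal sketch, not the actual selftest header: the names DEMO_SO_RCVLOWAT and demo_is_rcvlowat() are hypothetical, and the use of the compiler's `__powerpc__` predefined macro is an assumption about how the architecture might be detected. Only the two values, 16 and 18, come from the commit message.

```c
/*
 * Sketch only: DEMO_SO_RCVLOWAT is a hypothetical stand-in for the
 * SO_RCVLOWAT definition in the selftest header. The architecture check
 * via __powerpc__ is an assumption, not taken from the patch.
 */
#if defined(__powerpc__) || defined(__powerpc64__)
#define DEMO_SO_RCVLOWAT 16 /* powerpc value, per the commit message */
#else
#define DEMO_SO_RCVLOWAT 18 /* generic value, per the commit message */
#endif

/* Returns 1 if optname is the receive low-water-mark option for this target. */
static int demo_is_rcvlowat(int optname)
{
	return optname == DEMO_SO_RCVLOWAT;
}
```

With a hard-coded 18 on powerpc, a switch on the option name would never match the value the kernel actually uses there and would fall through to its default case, which is the failure the commit describes.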
2025-02-19	selftests/bpf: Add rto max for bpf_setsockopt test	Jason Xing
	Test the TCP_RTO_MAX_MS optname in the existing setget_sockopt test. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250219081333.56378-3-kerneljasonxing@gmail.com
2024-05-09	selftests/bpf: Remove the bpf_tcp_helpers.h usages from other non tcp-cc tests	Martin KaFai Lau
	The patch removes the remaining bpf_tcp_helpers.h usages in the non tcp-cc networking tests. It either replaces it with bpf_tracing_net.h or just removes it because the test is not actually using any kernel sockets. For the latter, the missing macro (mainly SOL_TCP) is defined locally.
	An exception is the test_sock_fields which is testing the "struct bpf_sock" type instead of the kernel sock type. Whenever "vmlinux.h" is used instead, it hits a verifier error on doing arithmetic on the sock_common pointer:
	  ; return !a6[0] && !a6[1] && !a6[2] && a6[3] == bpf_htonl(1); @ test_sock_fields.c:54
	  21: (61) r2 = *(u32 *)(r1 +28) ; R1_w=sock_common() R2_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
	  22: (56) if w2 != 0x0 goto pc-6 ; R2_w=0
	  23: (b7) r3 = 28 ; R3_w=28
	  24: (bf) r2 = r1 ; R1_w=sock_common() R2_w=sock_common()
	  25: (0f) r2 += r3
	  R2 pointer arithmetic on sock_common prohibited
	Hence, instead of including bpf_tracing_net.h, the test_sock_fields test defines a tcp_sock with one lsndtime field in it.
	Another highlight is, in sockopt_qos_to_cc.c, the tcp_cc_eq() is replaced by bpf_strncmp(). tcp_cc_eq() was a workaround in bpf_tcp_helpers.h before bpf_strncmp had been added.
	The SOL_IPV6 addition to bpf_tracing_net.h is needed by the test_tcpbpf_kern test.
	Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240509175026.3423614-10-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-09	selftests/bpf: Add a few tcp helper functions and macros to bpf_tracing_net.h	Martin KaFai Lau
	This patch adds a few tcp related helper functions to bpf_tracing_net.h. They will be useful for both tcp-cc and network tracing related bpf progs. They have already been in the bpf_tcp_helpers.h. This change is needed to retire the bpf_tcp_helpers.h and consolidate all tests to vmlinux.h (i.e. bpf_tracing_net.h). Some of the helpers (tcp_sk and inet_csk) are also defined in bpf_cc_cubic.c and they are removed. While at it, remove the vmlinux.h from bpf_cc_cubic.c. bpf_tracing_net.h (which has vmlinux.h after this patch) is enough and will be consistent with the other tcp-cc tests in the later patches. The other TCP_* macro additions will be needed for the bpf_dctcp changes in the later patch. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240509175026.3423614-3-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-02	selftests/bpf: Add test for the use of new args in cong_control	Miao Xu
	This patch adds a selftest to show the usage of the new arguments in cong_control. For simplicity's sake, the testing example reuses cubic's kernel functions. Signed-off-by: Miao Xu <miaxu@meta.com> Link: https://lore.kernel.org/r/20240502042318.801932-4-miaxu@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-23	selftest: bpf: Test bpf_sk_assign_tcp_reqsk().	Kuniyuki Iwashima
	This commit adds a sample selftest to demonstrate how we can use bpf_sk_assign_tcp_reqsk() as the backend of SYN Proxy. The test creates IPv4/IPv6 x TCP connections and transfers messages over them on lo with BPF tc prog attached. The tc prog will process SYN and return SYN+ACK with the following ISN and TS. In a real use case, this part will be done by other hosts.
	        MSB                                   LSB
	  ISN:  | 31 ... 8 | 7 6 |   5 |    4 | 3 2 1 0 |
	        |   Hash_1 | MSS | ECN | SACK |  WScale |
	  TS:   | 31 ... 8 |          7 ... 0           |
	        |   Random |           Hash_2           |
	WScale in SYN is reused in SYN+ACK. The client returns ACK, and tc prog will recalculate ISN and TS from ACK and validate SYN Cookie. If it's valid, the prog calls kfunc to allocate a reqsk for skb and configure the reqsk based on the argument created from SYN Cookie. Later, the reqsk will be processed in cookie_v[46]_check() to create a connection.
	Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240115205514.68364-7-kuniyu@amazon.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
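The ISN and TS bit layout in the commit message above can be illustrated with plain shifts and masks. The following is a minimal sketch under stated assumptions, not code from the selftest: the struct and function names (demo_cookie_opts, demo_pack_isn, demo_unpack_isn, demo_pack_ts) are hypothetical, and the MSS field is treated as an opaque 2-bit value because the commit message does not say how it is interpreted. How Hash_1, Hash_2 and Random are computed is out of scope here.

```c
#include <stdint.h>

/* Hypothetical container for the fields carried in the ISN. */
struct demo_cookie_opts {
	uint32_t hash1;  /* bits 31..8 of the ISN (24 bits) */
	uint8_t mss;     /* bits 7..6 (2 bits, opaque here) */
	uint8_t ecn;     /* bit 5 */
	uint8_t sack;    /* bit 4 */
	uint8_t wscale;  /* bits 3..0 */
};

/* Pack the fields into a 32-bit ISN following the layout in the commit message. */
static uint32_t demo_pack_isn(const struct demo_cookie_opts *o)
{
	return ((o->hash1 & 0xffffffu) << 8) |
	       ((uint32_t)(o->mss & 0x3u) << 6) |
	       ((uint32_t)(o->ecn & 0x1u) << 5) |
	       ((uint32_t)(o->sack & 0x1u) << 4) |
	       (uint32_t)(o->wscale & 0xfu);
}

/* Recover the fields from an ISN, as a validator would do from the ACK. */
static struct demo_cookie_opts demo_unpack_isn(uint32_t isn)
{
	struct demo_cookie_opts o;

	o.hash1 = isn >> 8;
	o.mss = (isn >> 6) & 0x3u;
	o.ecn = (isn >> 5) & 0x1u;
	o.sack = (isn >> 4) & 0x1u;
	o.wscale = isn & 0xfu;
	return o;
}

/* Pack the timestamp value: 24 random bits followed by the 8-bit Hash_2. */
static uint32_t demo_pack_ts(uint32_t random24, uint8_t hash2)
{
	return ((random24 & 0xffffffu) << 8) | hash2;
}
```

Because every option lives at a fixed bit position, the validating side can rebuild the same structure from the values the client sends back in its ACK without keeping any per-connection state.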
2024-01-13	selftests/bpf: Test udp and tcp iter batching	Martin KaFai Lau
	The patch adds a test to exercise the bpf_iter_udp batching logic. It specifically tests the case that there are multiple so_reuseport udp_sk in a bucket of the udp_table. The test creates two sets of so_reuseport sockets, each set on a different port, meaning there will be two buckets in the udp_table. The test does the following:
	1. read() 3 out of 4 sockets in the first bucket.
	2. close() all sockets in the first bucket. This will ensure the current bucket's offset in the kernel does not affect the read() of the following bucket.
	3. read() all 4 sockets in the second bucket.
	The test also reads one udp_sk at a time from the bpf_iter_udp prog: the "true" case in "do_test(..., bool onebyone)". This is the buggy case that the previous patch fixed. It also tests the "false" case in "do_test(..., bool onebyone)", meaning the userspace reads the whole bucket. There is no bug in this case, but the test is added while at it.
	Considering the way to have multiple tcp_sk in the same bucket is similar (by using so_reuseport), this patch also tests the bpf_iter_tcp even though the bpf_iter_tcp batching logic works correctly. Both IP v4 and v6 are exercising the same bpf_iter batching code path, so only v6 is tested.
	Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240112190530.3751661-4-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	bpf: selftests: test_tunnel: Use vmlinux.h declarations	Daniel Xu
	vmlinux.h declarations are more ergonomic, especially when working with kfuncs. The uapi headers are often incomplete for kfunc definitions. This commit also switches bitfield accesses to use CO-RE helpers. Switching to vmlinux.h definitions makes the verifier very unhappy with raw bitfield accesses. The error is:
	  ; md.u.md2.dir = direction;
	  33: (69) r1 = *(u16 *)(r2 +11)
	  misaligned stack access off (0x0; 0x0)+-64+11 size 2
	Fix by using CO-RE-aware bitfield reads and writes.
	Co-developed-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Antony Antony <antony.antony@secunet.com> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/884bde1d9a351d126a3923886b945ea6b1b0776b.1702593901.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-06	selftests/bpf: Check bpf_sk_storage has uncharged sk_omem_alloc	Martin KaFai Lau
	This patch checks the sk_omem_alloc has been uncharged by bpf_sk_storage during the __sk_destruct. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230901231129.578493-4-martin.lau@linux.dev
2022-12-22	selftests/bpf: Test bpf_skb_adjust_room on CHECKSUM_PARTIAL	Martin KaFai Lau
	When the bpf_skb_adjust_room() shrinks the skb such that its csum_start is invalid, the skb->ip_summed should be reset from CHECKSUM_PARTIAL to CHECKSUM_NONE. The commit 54c3f1a81421 ("bpf: pull before calling skb_postpull_rcsum()") fixed it. This patch adds a test to ensure the skb->ip_summed changed from CHECKSUM_PARTIAL to CHECKSUM_NONE after bpf_skb_adjust_room(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221221185653.1589961-1-martin.lau@linux.dev
2022-12-05	selftests/bpf: add xfrm_info tests	Eyal Birger
	Test the xfrm_info kfunc helpers. The test setup creates three namespaces - NS0, NS1, NS2. XFRM tunnels are set up between NS0 and the two other NSs. The kfunc helpers are used to steer traffic from NS0 to the other NSs based on a userspace populated bpf global variable and validate that the return traffic had arrived from the desired NS. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20221203084659.1837829-5-eyal.birger@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-02	selftest/bpf: Add test for bpf_getsockopt()	Martin KaFai Lau
	This patch removes the __bpf_getsockopt() which directly reads the sk by using PTR_TO_BTF_ID. Instead, the test now directly uses the kernel bpf helper bpf_getsockopt() which supports all the required optname now. TCP_SAVE[D]_SYN and TCP_MAXSEG are not tested in a loop for all the hooks and sock_ops's cb. TCP_SAVE[D]_SYN only works in passive connection. TCP_MAXSEG only works when it is setsockopt before the connection is established and the getsockopt return value can only be tested after the connection is established. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002937.2896904-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-18	selftests/bpf: bpf_setsockopt tests	Martin KaFai Lau
	This patch adds tests to exercise optnames that are allowed in bpf_setsockopt(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061847.4182339-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-29	selftests/bpf: lsm_cgroup functional test	Stanislav Fomichev
	Functional test that exercises the following:
	1. apply default sk_priority policy
	2. permit TX-only AF_PACKET socket
	3. cgroup attach/detach/replace
	4. reusing trampoline shim
	Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220628174314.1216643-12-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-18	selftest/bpf: Test batching and bpf_(get|set)sockopt in bpf unix iter.	Kuniyuki Iwashima
	This patch adds a test for the batching and bpf_(get|set)sockopt in bpf unix iter. It does the following.
	1. Creates an abstract UNIX domain socket
	2. Call bpf_setsockopt()
	3. Call bpf_getsockopt() and save the value
	4. Call setsockopt()
	5. Call getsockopt() and save the value
	6. Compare the saved values
	Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Link: https://lore.kernel.org/r/20220113002849.4384-5-kuniyu@amazon.co.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-26	af_unix: Remove UNIX_ABSTRACT() macro and test sun_path[0] instead.	Kuniyuki Iwashima
	In BSD and abstract address cases, we store sockets in the hash table with keys between 0 and UNIX_HASH_SIZE - 1. However, the hash saved in a socket varies depending on its address type; sockets with BSD addresses always have UNIX_HASH_SIZE in their unix_sk(sk)->addr->hash. This is just for the UNIX_ABSTRACT() macro used to check the address type. The difference of the saved hashes comes from the first byte of the address in the first place. So, we can test it directly. Then we can keep a real hash in each socket and replace unix_table_lock with per-hash locks in the later patch. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
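The idea of testing the first byte of the address directly can be shown with a userspace analogue. This is only a sketch under stated assumptions: the commit changes kernel code, whereas the helpers below (demo_make_abstract and demo_is_abstract, both hypothetical names) operate on a userspace `struct sockaddr_un`. The extra length check for unnamed sockets is an addition for this sketch and is not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill addr with an abstract address whose name is the
 * name_len bytes at name, and return the address length to pass to bind().
 * Abstract addresses start with a NUL byte in sun_path[0].
 */
static socklen_t demo_make_abstract(struct sockaddr_un *addr,
				    const char *name, size_t name_len)
{
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	if (name_len > sizeof(addr->sun_path) - 1)
		name_len = sizeof(addr->sun_path) - 1;
	memcpy(addr->sun_path + 1, name, name_len);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
}

/*
 * Hypothetical helper: classify an address by looking at sun_path[0],
 * instead of relying on a sentinel value stored elsewhere.
 */
static bool demo_is_abstract(const struct sockaddr_un *addr, socklen_t addr_len)
{
	/* An unnamed socket carries no sun_path bytes at all. */
	if (addr_len <= offsetof(struct sockaddr_un, sun_path))
		return false;
	return addr->sun_path[0] == '\0';
}
```

A pathname (BSD) address such as "/tmp/demo.sock" has a non-NUL first byte, so the same check distinguishes the two address types without any extra bookkeeping.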
2021-08-15	selftest/bpf: Implement sample UNIX domain socket iterator program.	Kuniyuki Iwashima
	The iterator can output almost the same result compared to /proc/net/unix. The header line is aligned, and the Inode column uses "%8lu" because "%5lu" can easily overflow.
	  # cat /sys/fs/bpf/unix
	  Num               RefCount Protocol Flags    Type St    Inode Path
	  ffff963c06689800: 00000002 00000000 00010000 0001 01    18697 private/defer
	  ffff963c7c979c00: 00000002 00000000 00000000 0001 01   598245 @Hello@World@
	  # cat /proc/net/unix
	  Num       RefCount Protocol Flags    Type St Inode Path
	  ffff963c06689800: 00000002 00000000 00010000 0001 01 18697 private/defer
	  ffff963c7c979c00: 00000002 00000000 00000000 0001 01 598245 @Hello@World@
	Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210814015718.42704-4-kuniyu@amazon.co.jp
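The remark about "%5lu" overflowing can be checked with ordinary printf-style formatting. The sketch below is a userspace illustration only; demo_format_inode() is a hypothetical helper, and the real iterator program formats its output inside BPF rather than with snprintf(). The two inode numbers are the ones shown in the commit message.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an inode number right-aligned in a field of
 * the given minimum width and return the number of characters produced.
 * A minimum width never truncates, so wide values push later columns right.
 */
static int demo_format_inode(char *buf, size_t len, unsigned long inode, int width)
{
	return snprintf(buf, len, "%*lu", width, inode);
}
```

With a width of 5, the inode 18697 fits exactly, while 598245 needs six characters and shifts the Path column by one. With a width of 8, both values occupy eight characters, so the columns stay aligned.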
2021-07-23	bpf: selftest: Test batching and bpf_(get|set)sockopt in bpf tcp iter	Martin KaFai Lau
	This patch adds tests for the batching and bpf_(get|set)sockopt in bpf tcp iter. It first creates:
	a) 1 non SO_REUSEPORT listener in lhash2.
	b) 256 passive and active fds connected to the listener in (a).
	c) 256 SO_REUSEPORT listeners in one of the lhash2 buckets.
	The test sets all listeners and connections to bpf_cubic before running the bpf iter. The bpf iter then calls setsockopt(TCP_CONGESTION) to switch each listener and connection from bpf_cubic to bpf_dctcp. The bpf iter has a random_retry mode such that it can return EAGAIN to the userspace in the middle of a batch.
	Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210701200625.1036874-1-kafai@fb.com
2020-06-24	selftests/bpf: Add more common macros to bpf_tracing_net.h	Yonghong Song
	These newly added macros will be used in subsequent bpf iterator tcp{4,6} and udp{4,6} programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230819.3989050-1-yhs@fb.com
2020-06-24	selftests/bpf: Refactor some net macros to bpf_tracing_net.h	Yonghong Song
	Refactor bpf_iter_ipv6_route.c and bpf_iter_netlink.c so net macros, originally from various include/linux header files, are moved to a new header file bpf_tracing_net.h. The goal is to improve reuse so networking tracing programs do not need to copy these macros every time they use them. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230817.3988962-1-yhs@fb.com