lwn.git - Linux kernel documentation tree maintained by Jonathan Corbet

Age	Commit message (Collapse)	Author
2021-12-30	bpf, docs: Generate nicer tables for instruction encodings	Christoph Hellwig
	Use RST tables that are nicely readable both in plain ascii as well as in html to render the instruction encodings, and add a few subheadings to better structure the text. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211223101906.977624-4-hch@lst.de
2021-12-30	bpf, docs: Split the comparism to classic BPF from instruction-set.rst	Christoph Hellwig
	Split the introductory that explain eBPF vs classic BPF and how it maps to hardware from the instruction set specification into a standalone document. This duplicates a little bit of information but gives us a useful reference for the eBPF instrution set that is not encumbered by classic BPF. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211223101906.977624-3-hch@lst.de
2021-12-30	bpf, docs: Fix verifier references	Christoph Hellwig
	Use normal RST file reference instead of linkage copied from the old filter.rst document that does not actually work when using HTML output. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211223101906.977624-2-hch@lst.de
2021-12-29	Merge branch 'lighten uapi/bpf.h rebuilds'	Alexei Starovoitov
	Jakub Kicinski says: ==================== Last change in the bpf headers - disentangling BPF uapi from netdevice.h. Both linux/bpf.h and uapi/bpf.h changes should now rebuild ~1k objects down from the original ~18k. There's probably more that can be done but it's diminishing returns. Split into two patches for ease of review. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-29	bpf: Invert the dependency between bpf-netns.h and netns/bpf.h	Jakub Kicinski
	netns/bpf.h gets included by netdevice.h (thru net_namespace.h) which in turn gets included in a lot of places. We should keep netns/bpf.h as light-weight as possible. bpf-netns.h seems to contain more implementation details than deserves to be included in a netns header. It needs to pull in uapi/bpf.h to get various enum types. Move enum netns_bpf_attach_type to netns/bpf.h and invert the dependency. This makes netns/bpf.h fit the mold of a struct definition header more clearly, and drops the number of objects rebuilt when uapi/bpf.h is touched from 7.7k to 1.1k. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211230012742.770642-3-kuba@kernel.org
2021-12-29	net: Add includes masked by netdevice.h including uapi/bpf.h	Jakub Kicinski
	Add missing includes unmasked by the subsequent change. Mostly network drivers missing an include for XDP_PACKET_HEADROOM. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211230012742.770642-2-kuba@kernel.org
2021-12-29	Merge branch 'Sleepable local storage'	Alexei Starovoitov
	KP Singh says: ==================== Local storage is currently unusable in sleepable helpers. One of the important use cases of local_storage is to attach security (or performance) contextual information to kernel objects in LSM / tracing programs to be used later in the life-cyle of the object. Sometimes this context can only be gathered from sleepable programs (because it needs accesing __user pointers or helpers like bpf_ima_inode_hash). Allowing local storage to be used from sleepable programs allows such context to be managed with the benefits of local_storage. # v2 -> v3 * Fixed some RCU issues pointed by Martin * Added Martin's ack # v1 -> v2 * Generalize RCU checks (will send a separate patch for updating non local storage code where this can be used). * Add missing RCU lock checks from v1 ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-29	bpf/selftests: Update local storage selftest for sleepable programs	KP Singh
	Remove the spin lock logic and update the selftests to use sleepable programs to use a mix of sleepable and non-sleepable programs. It's more useful to test the sleepable programs since the tests don't really need spinlocks. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20211224152916.1550677-3-kpsingh@kernel.org
2021-12-29	bpf: Allow bpf_local_storage to be used by sleepable programs	KP Singh
	Other maps like hashmaps are already available to sleepable programs. Sleepable BPF programs run under trace RCU. Allow task, sk and inode storage to be used from sleepable programs. This allows sleepable and non-sleepable programs to provide shareable annotations on kernel objects. Sleepable programs run in trace RCU where as non-sleepable programs run in a normal RCU critical section i.e. __bpf_prog_enter{_sleepable} and __bpf_prog_exit{_sleepable}) (rcu_read_lock or rcu_read_lock_trace). In order to make the local storage maps accessible to both sleepable and non-sleepable programs, one needs to call both call_rcu_tasks_trace and call_rcu to wait for both trace and classical RCU grace periods to expire before freeing memory. Paul's work on call_rcu_tasks_trace allows us to have per CPU queueing for call_rcu_tasks_trace. This behaviour can be achieved by setting rcupdate.rcu_task_enqueue_lim=<num_cpus> boot parameter. In light of these new performance changes and to keep the local storage code simple, avoid adding a new flag for sleepable maps / local storage to select the RCU synchronization (trace / classical). Also, update the dereferencing of the pointers to use rcu_derference_check (with either the trace or normal RCU locks held) with a common bpf_rcu_lock_held helper method. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20211224152916.1550677-2-kpsingh@kernel.org
2021-12-29	bpf: Add missing map_get_next_key method to bloom filter map.	Haimin Zhang
	Without it, kernel crashes in map_get_next_key(). Fixes: 9330986c0300 ("bpf: Add bloom filter map implementation") Reported-by: TCS Robot <tcs_robot@tencent.com> Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Joanne Koong <joannekoong@fb.com> Link: https://lore.kernel.org/bpf/1640776802-22421-1-git-send-email-tcs.kernel@gmail.com
2021-12-29	net: Don't include filter.h from net/sock.h	Jakub Kicinski
	sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of struct sk_filter instead. This decreases the number of rebuilt objects when bpf.h is touched from ~5k to ~1k. There's a lot of missing includes this was masking. Primarily in networking tho, this time. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-28	libbpf: Improve LINUX_VERSION_CODE detection	Andrii Nakryiko
	Ubuntu reports incorrect kernel version through uname(), which on older kernels leads to kprobe BPF programs failing to load due to the version check mismatch. Accommodate Ubuntu's quirks with LINUX_VERSION_CODE by using Ubuntu-specific /proc/version_code to fetch major/minor/patch versions to form LINUX_VERSION_CODE. While at it, consolide libbpf's kernel version detection code between libbpf.c and libbpf_probes.c. [0] Closes: https://github.com/libbpf/libbpf/issues/421 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211222231003.2334940-1-andrii@kernel.org
2021-12-28	libbpf: Use 100-character limit to make bpf_tracing.h easier to read	Andrii Nakryiko
	Improve bpf_tracing.h's macro definition readability by keeping them single-line and better aligned. This makes it easier to follow all those variadic patterns. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211222213924.1869758-2-andrii@kernel.org
2021-12-28	libbpf: Normalize PT_REGS_xxx() macro definitions	Andrii Nakryiko
	Refactor PT_REGS macros definitions in bpf_tracing.h to avoid excessive duplication. We currently have classic PT_REGS_xxx() and CO-RE-enabled PT_REGS_xxx_CORE(). We are about to add also _SYSCALL variants, which would require excessive copying of all the per-architecture definitions. Instead, separate architecture-specific field/register names from the final macro that utilize them. That way for upcoming _SYSCALL variants we'll be able to just define x86_64 exception and otherwise have one common set of _SYSCALL macro definitions common for all architectures. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20211222213924.1869758-1-andrii@kernel.org
2021-12-23	selftests/bpf: Add btf_dump__new to test_cpp	Jiri Olsa
	Adding btf_dump__new call to test_cpp, so we can test C++ compilation with that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211223131736.483956-2-jolsa@kernel.org
2021-12-23	libbpf: Do not use btf_dump__new() macro in C++ mode	Jiri Olsa
	As reported in here [0], C++ compilers don't support __builtin_types_compatible_p(), so at least don't screw up compilation for them and let C++ users pick btf_dump__new vs btf_dump__new_deprecated explicitly. [0] https://github.com/libbpf/libbpf/issues/283#issuecomment-986100727 Fixes: 6084f5dc928f ("libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211223131736.483956-1-jolsa@kernel.org
2021-12-21	bpftool: Enable line buffering for stdout	Paul Chaignon
	The output of bpftool prog tracelog is currently buffered, which is inconvenient when piping the output into other commands. A simple tracelog \| grep will typically not display anything. This patch fixes it by enabling line buffering on stdout for the whole bpftool binary. Fixes: 30da46b5dc3a ("tools: bpftool: add a command to dump the trace pipe") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Paul Chaignon <paul@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211220214528.GA11706@Mem
2021-12-21	bpf: Use struct_size() helper	Xiu Jianfeng
	In an effort to avoid open-coded arithmetic in the kernel, use the struct_size() helper instead of open-coded calculation. Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://github.com/KSPP/linux/issues/160 Link: https://lore.kernel.org/bpf/20211220113048.2859-1-xiujianfeng@huawei.com
2021-12-20	selftests/bpf: Correct the INDEX address in vmtest.sh	Pu Lehui
	Migration of vmtest to libbpf/ci will change the address of INDEX in vmtest.sh, which will cause vmtest.sh to not work due to the failure of rootfs fetching. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Link: https://lore.kernel.org/bpf/20211220050803.2670677-1-pulehui@huawei.com
2021-12-18	bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support	Kumar Kartikeya Dwivedi
	Allow passing PTR_TO_CTX, if the kfunc expects a matching struct type, and punt to PTR_TO_MEM block if reg->type does not fall in one of PTR_TO_BTF_ID or PTR_TO_SOCK* types. This will be used by future commits to get access to XDP and TC PTR_TO_CTX, and pass various data (flags, l4proto, netns_id, etc.) encoded in opts struct passed as pointer to kfunc. For PTR_TO_MEM support, arguments are currently limited to pointer to scalar, or pointer to struct composed of scalars. This is done so that unsafe scenarios (like passing PTR_TO_MEM where PTR_TO_BTF_ID of in-kernel valid structure is expected, which may have pointers) are avoided. Since the argument checking happens basd on argument register type, it is not easy to ascertain what the expected type is. In the future, support for PTR_TO_MEM for kfunc can be extended to serve other usecases. The struct type whose pointer is passed in may have maximum nesting depth of 4, all recursively composed of scalars or struct with scalars. Future commits will add negative tests that check whether these restrictions imposed for kfunc arguments are duly rejected by BPF verifier or not. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217015031.1278167-4-memxor@gmail.com
2021-12-18	Merge branch 'Introduce composable bpf types'	Alexei Starovoitov
	Hao Luo says: ==================== This patch set consists of two changes: - a cleanup of arg_type, ret_type and reg_type which try to make those types composable. (patch 1/9 - patch 6/9) - a bug fix that prevents bpf programs from writing kernel memory. (patch 7/9 - patch 9/9) The purpose of the cleanup is to find a scalable way to express type nullness and read-onliness. This patchset introduces two flags that can be applied on all three types: PTR_MAYBE_NULL and MEM_RDONLY. Previous types such as ARG_XXX_OR_NULL can now be written as ARG_XXX \| PTR_MAYBE_NULL Similarly, PTR_TO_RDONLY_BUF is now "PTR_TO_BUF \| MEM_RDONLY". Flags can be composed, as ARGs can be both MEM_RDONLY and MAYBE_NULL. ARG_PTR_TO_MEM \| PTR_MAYBE_NULL \| MEM_RDONLY Based on this new composable types, patch 7/9 applies MEM_RDONLY on PTR_TO_MEM, in order to tag the returned memory from per_cpu_ptr as read-only. Therefore fixing a previous bug that one can leverage per_cpu_ptr to modify kernel memory within BPF programs. Patch 8/9 generalizes the use of MEM_RDONLY further by tagging a set of helper arguments ARG_PTR_TO_MEM with MEM_RDONLY. Some helper functions may override their arguments, such as bpf_d_path, bpf_snprintf. In this patch, we narrow the ARG_PTR_TO_MEM to be compatible with only a subset of memory types. This prevents these helpers from writing read-only memories. For the helpers that do not write its arguments, we add tag MEM_RDONLY to allow taking a RDONLY memory as argument. Changes since v1: - use %u to print base_type(type) instead of %lu. (Andrii, patch 3/9) - improve reg_type_str() by appending '_or_null' and prepending 'rdonly_'. use preallocated buffer in 'bpf_env'. - unified handling of the previous XXX_OR_NULL in adjust_ptr_min_max_vals (Andrii, patch 4/9) - move PTR_TO_MAP_KEY up to PTR_TO_MAP_VALUE so that we don't have to change to drivers that assume the numeric values of bpf_reg. (patch 4/9) - reintroduce the typo from previous commits in fixes tags (Andrii, patch 7/9) - extensive comments on the reason behind folding flags in check_reg_type (Andrii, patch 8/9) Changes since RFC v2: - renamed BPF_BASE_TYPE to a more succinct name base_type and move its definition to bpf_verifier.h. Same for BPF_TYPE_FLAG. (Alexei) - made checking MEM_RDONLY in check_reg_type() universal (Alexei) - ran through majority of test_progs and fixed bugs in RFC v2: - fixed incorrect BPF_BASE_TYPE_MASK. The high bit of GENMASK should be BITS - 1, rather than BITS. patch 1/9. - fixed incorrect conditions when checking ARG_PTR_TO_MAP_VALUE in check_func_arg(). See patch 2/9. - fixed a bug where PTR_TO_BTF_ID may be combined with MEM_RDONLY, causing the check in check_mem_access() to fall through to the 'else' branch. See check_helper_call() in patch 7/9. - fixed build failure on netronome driver. Entries in bpf_reg_type have been ordered. patch 4/9. - fixed build warnings of using '%d' to print base_type. patch 4/9 - unify arg_type_may_be_null() and reg_type_may_be_null() into a single type_may_be_null(). Previous versions: v1: https://lwn.net/Articles/877938/ RFC v2: https://lwn.net/Articles/877171/ RFC v1: https://lore.kernel.org/bpf/20211109003052.3499225-1-haoluo@google.com/T/ https://lore.kernel.org/bpf/20211109021624.1140446-8-haoluo@google.com/T/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-18	bpf/selftests: Test PTR_TO_RDONLY_MEM	Hao Luo
	This test verifies that a ksym of non-struct can not be directly updated. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-10-haoluo@google.com
2021-12-18	bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem.	Hao Luo
	Some helper functions may modify its arguments, for example, bpf_d_path, bpf_get_stack etc. Previously, their argument types were marked as ARG_PTR_TO_MEM, which is compatible with read-only mem types, such as PTR_TO_RDONLY_BUF. Therefore it's legitimate, but technically incorrect, to modify a read-only memory by passing it into one of such helper functions. This patch tags the bpf_args compatible with immutable memory with MEM_RDONLY flag. The arguments that don't have this flag will be only compatible with mutable memory types, preventing the helper from modifying a read-only memory. The bpf_args that have MEM_RDONLY are compatible with both mutable memory and immutable memory. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-9-haoluo@google.com
2021-12-18	bpf: Make per_cpu_ptr return rdonly PTR_TO_MEM.	Hao Luo
	Tag the return type of {per, this}_cpu_ptr with RDONLY_MEM. The returned value of this pair of helpers is kernel object, which can not be updated by bpf programs. Previously these two helpers return PTR_OT_MEM for kernel objects of scalar type, which allows one to directly modify the memory. Now with RDONLY_MEM tagging, the verifier will reject programs that write into RDONLY_MEM. Fixes: 63d9b80dcf2c ("bpf: Introducte bpf_this_cpu_ptr()") Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()") Fixes: 4976b718c355 ("bpf: Introduce pseudo_btf_id") Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-8-haoluo@google.com
2021-12-18	bpf: Convert PTR_TO_MEM_OR_NULL to composable types.	Hao Luo
	Remove PTR_TO_MEM_OR_NULL and replace it with PTR_TO_MEM combined with flag PTR_MAYBE_NULL. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-7-haoluo@google.com
2021-12-18	bpf: Introduce MEM_RDONLY flag	Hao Luo
	This patch introduce a flag MEM_RDONLY to tag a reg value pointing to read-only memory. It makes the following changes: 1. PTR_TO_RDWR_BUF -> PTR_TO_BUF 2. PTR_TO_RDONLY_BUF -> PTR_TO_BUF \| MEM_RDONLY Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-6-haoluo@google.com
2021-12-18	bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX \| PTR_MAYBE_NULL	Hao Luo
	We have introduced a new type to make bpf_reg composable, by allocating bits in the type to represent flags. One of the flags is PTR_MAYBE_NULL which indicates a pointer may be NULL. This patch switches the qualified reg_types to use this flag. The reg_types changed in this patch include: 1. PTR_TO_MAP_VALUE_OR_NULL 2. PTR_TO_SOCKET_OR_NULL 3. PTR_TO_SOCK_COMMON_OR_NULL 4. PTR_TO_TCP_SOCK_OR_NULL 5. PTR_TO_BTF_ID_OR_NULL 6. PTR_TO_MEM_OR_NULL 7. PTR_TO_RDONLY_BUF_OR_NULL 8. PTR_TO_RDWR_BUF_OR_NULL Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20211217003152.48334-5-haoluo@google.com
2021-12-18	bpf: Replace RET_XXX_OR_NULL with RET_XXX \| PTR_MAYBE_NULL	Hao Luo
	We have introduced a new type to make bpf_ret composable, by reserving high bits to represent flags. One of the flag is PTR_MAYBE_NULL, which indicates a pointer may be NULL. When applying this flag to ret_types, it means the returned value could be a NULL pointer. This patch switches the qualified arg_types to use this flag. The ret_types changed in this patch include: 1. RET_PTR_TO_MAP_VALUE_OR_NULL 2. RET_PTR_TO_SOCKET_OR_NULL 3. RET_PTR_TO_TCP_SOCK_OR_NULL 4. RET_PTR_TO_SOCK_COMMON_OR_NULL 5. RET_PTR_TO_ALLOC_MEM_OR_NULL 6. RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL 7. RET_PTR_TO_BTF_ID_OR_NULL This patch doesn't eliminate the use of these names, instead it makes them aliases to 'RET_PTR_TO_XXX \| PTR_MAYBE_NULL'. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-4-haoluo@google.com
2021-12-18	bpf: Replace ARG_XXX_OR_NULL with ARG_XXX \| PTR_MAYBE_NULL	Hao Luo
	We have introduced a new type to make bpf_arg composable, by reserving high bits of bpf_arg to represent flags of a type. One of the flags is PTR_MAYBE_NULL which indicates a pointer may be NULL. When applying this flag to an arg_type, it means the arg can take NULL pointer. This patch switches the qualified arg_types to use this flag. The arg_types changed in this patch include: 1. ARG_PTR_TO_MAP_VALUE_OR_NULL 2. ARG_PTR_TO_MEM_OR_NULL 3. ARG_PTR_TO_CTX_OR_NULL 4. ARG_PTR_TO_SOCKET_OR_NULL 5. ARG_PTR_TO_ALLOC_MEM_OR_NULL 6. ARG_PTR_TO_STACK_OR_NULL This patch does not eliminate the use of these arg_types, instead it makes them an alias to the 'ARG_XXX \| PTR_MAYBE_NULL'. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-3-haoluo@google.com
2021-12-18	bpf: Introduce composable reg, ret and arg types.	Hao Luo
	There are some common properties shared between bpf reg, ret and arg values. For instance, a value may be a NULL pointer, or a pointer to a read-only memory. Previously, to express these properties, enumeration was used. For example, in order to test whether a reg value can be NULL, reg_type_may_be_null() simply enumerates all types that are possibly NULL. The problem of this approach is that it's not scalable and causes a lot of duplication. These properties can be combined, for example, a type could be either MAYBE_NULL or RDONLY, or both. This patch series rewrites the layout of reg_type, arg_type and ret_type, so that common properties can be extracted and represented as composable flag. For example, one can write ARG_PTR_TO_MEM \| PTR_MAYBE_NULL which is equivalent to the previous ARG_PTR_TO_MEM_OR_NULL The type ARG_PTR_TO_MEM are called "base type" in this patch. Base types can be extended with flags. A flag occupies the higher bits while base types sits in the lower bits. This patch in particular sets up a set of macro for this purpose. The following patches will rewrite arg_types, ret_types and reg_types respectively. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211217003152.48334-2-haoluo@google.com
2021-12-17	bpftool: Reimplement large insn size limit feature probing	Andrii Nakryiko
	Reimplement bpf_probe_large_insn_limit() in bpftool, as that libbpf API is scheduled for deprecation in v0.8. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-4-andrii@kernel.org
2021-12-17	selftests/bpf: Add libbpf feature-probing API selftests	Andrii Nakryiko
	Add selftests for prog/map/prog+helper feature probing APIs. Prog and map selftests are designed in such a way that they will always test all the possible prog/map types, based on running kernel's vmlinux BTF enum definition. This way we'll always be sure that when adding new BPF program types or map types, libbpf will be always updated accordingly to be able to feature-detect them. BPF prog_helper selftest will have to be manually extended with interesting and important prog+helper combinations, it's easy, but can't be completely automated. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-3-andrii@kernel.org
2021-12-17	libbpf: Rework feature-probing APIs	Andrii Nakryiko
	Create three extensible alternatives to inconsistently named feature-probing APIs: - libbpf_probe_bpf_prog_type() instead of bpf_probe_prog_type(); - libbpf_probe_bpf_map_type() instead of bpf_probe_map_type(); - libbpf_probe_bpf_helper() instead of bpf_probe_helper(). Set up return values such that libbpf can report errors (e.g., if some combination of input arguments isn't possible to validate, etc), in addition to whether the feature is supported (return value 1) or not supported (return value 0). Also schedule deprecation of those three APIs. Also schedule deprecation of bpf_probe_large_insn_limit(). Also fix all the existing detection logic for various program and map types that never worked: - BPF_PROG_TYPE_LIRC_MODE2; - BPF_PROG_TYPE_TRACING; - BPF_PROG_TYPE_LSM; - BPF_PROG_TYPE_EXT; - BPF_PROG_TYPE_SYSCALL; - BPF_PROG_TYPE_STRUCT_OPS; - BPF_MAP_TYPE_STRUCT_OPS; - BPF_MAP_TYPE_BLOOM_FILTER. Above prog/map types needed special setups and detection logic to work. Subsequent patch adds selftests that will make sure that all the detection logic keeps working for all current and future program and map types, avoiding otherwise inevitable bit rot. [0] Closes: https://github.com/libbpf/libbpf/issues/312 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Cc: Julia Kartseva <hex@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org
2021-12-16	Only output backtracking information in log level 2	Christy Lee
	Backtracking information is very verbose, don't print it in log level 1 to improve readability. Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211216213358.3374427-4-christylee@fb.com
2021-12-16	bpf: Right align verifier states in verifier logs.	Christy Lee
	Make the verifier logs more readable, print the verifier states on the corresponding instruction line. If the previous line was not a bpf instruction, then print the verifier states on its own line. Before: Validating test_pkt_access_subprog3() func#3... 86: R1=invP(id=0) R2=ctx(id=0,off=0,imm=0) R10=fp0 ; int test_pkt_access_subprog3(int val, struct __sk_buff skb) 86: (bf) r6 = r2 87: R2=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) 87: (bc) w7 = w1 88: R1=invP(id=0) R7_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; return get_skb_len(skb) get_skb_ifindex(val, skb, get_constant(123)); 88: (bf) r1 = r6 89: R1_w=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) 89: (85) call pc+9 Func#4 is global and valid. Skipping. 90: R0_w=invP(id=0) 90: (bc) w8 = w0 91: R0_w=invP(id=0) R8_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; return get_skb_len(skb) * get_skb_ifindex(val, skb, get_constant(123)); 91: (b7) r1 = 123 92: R1_w=invP123 92: (85) call pc+65 Func#5 is global and valid. Skipping. 93: R0=invP(id=0) After: 86: R1=invP(id=0) R2=ctx(id=0,off=0,imm=0) R10=fp0 ; int test_pkt_access_subprog3(int val, struct __sk_buff skb) 86: (bf) r6 = r2 ; R2=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) 87: (bc) w7 = w1 ; R1=invP(id=0) R7_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; return get_skb_len(skb) get_skb_ifindex(val, skb, get_constant(123)); 88: (bf) r1 = r6 ; R1_w=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0) 89: (85) call pc+9 Func#4 is global and valid. Skipping. 90: R0_w=invP(id=0) 90: (bc) w8 = w0 ; R0_w=invP(id=0) R8_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; return get_skb_len(skb) * get_skb_ifindex(val, skb, get_constant(123)); 91: (b7) r1 = 123 ; R1_w=invP123 92: (85) call pc+65 Func#5 is global and valid. Skipping. 93: R0=invP(id=0) Signed-off-by: Christy Lee <christylee@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-16	bpf: Only print scratched registers and stack slots to verifier logs.	Christy Lee
	When printing verifier state for any log level, print full verifier state only on function calls or on errors. Otherwise, only print the registers and stack slots that were accessed. Log size differences: verif_scale_loop6 before: 234566564 verif_scale_loop6 after: 72143943 69% size reduction kfree_skb before: 166406 kfree_skb after: 55386 69% size reduction Before: 156: (61) r0 = (u32 )(r1 +0) 157: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1=ctx(id=0,off=0,imm=0) R2_w=invP0 R10=fp0 fp-8_w=00000000 fp-16_w=00\ 000000 fp-24_w=00000000 fp-32_w=00000000 fp-40_w=00000000 fp-48_w=00000000 fp-56_w=00000000 fp-64_w=00000000 fp-72_w=00000000 fp-80_w=00000\ 000 fp-88_w=00000000 fp-96_w=00000000 fp-104_w=00000000 fp-112_w=00000000 fp-120_w=00000000 fp-128_w=00000000 fp-136_w=00000000 fp-144_w=00\ 000000 fp-152_w=00000000 fp-160_w=00000000 fp-168_w=00000000 fp-176_w=00000000 fp-184_w=00000000 fp-192_w=00000000 fp-200_w=00000000 fp-208\ _w=00000000 fp-216_w=00000000 fp-224_w=00000000 fp-232_w=00000000 fp-240_w=00000000 fp-248_w=00000000 fp-256_w=00000000 fp-264_w=00000000 f\ p-272_w=00000000 fp-280_w=00000000 fp-288_w=00000000 fp-296_w=00000000 fp-304_w=00000000 fp-312_w=00000000 fp-320_w=00000000 fp-328_w=00000\ 000 fp-336_w=00000000 fp-344_w=00000000 fp-352_w=00000000 fp-360_w=00000000 fp-368_w=00000000 fp-376_w=00000000 fp-384_w=00000000 fp-392_w=\ 00000000 fp-400_w=00000000 fp-408_w=00000000 fp-416_w=00000000 fp-424_w=00000000 fp-432_w=00000000 fp-440_w=00000000 fp-448_w=00000000 ; return skb->len; 157: (95) exit Func#4 is safe for any args that match its prototype Validating get_constant() func#5... 158: R1=invP(id=0) R10=fp0 ; int get_constant(long val) 158: (bf) r0 = r1 159: R0_w=invP(id=1) R1=invP(id=1) R10=fp0 ; return val - 122; 159: (04) w0 += -122 160: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1=invP(id=1) R10=fp0 ; return val - 122; 160: (95) exit Func#5 is safe for any args that match its prototype Validating get_skb_ifindex() func#6... 161: R1=invP(id=0) R2=ctx(id=0,off=0,imm=0) R3=invP(id=0) R10=fp0 ; int get_skb_ifindex(int val, struct __sk_buff skb, int var) 161: (bc) w0 = w3 162: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1=invP(id=0) R2=ctx(id=0,off=0,imm=0) R3=invP(id=0) R10=fp0 After: 156: (61) r0 = (u32 )(r1 +0) 157: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1=ctx(id=0,off=0,imm=0) ; return skb->len; 157: (95) exit Func#4 is safe for any args that match its prototype Validating get_constant() func#5... 158: R1=invP(id=0) R10=fp0 ; int get_constant(long val) 158: (bf) r0 = r1 159: R0_w=invP(id=1) R1=invP(id=1) ; return val - 122; 159: (04) w0 += -122 160: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; return val - 122; 160: (95) exit Func#5 is safe for any args that match its prototype Validating get_skb_ifindex() func#6... 161: R1=invP(id=0) R2=ctx(id=0,off=0,imm=0) R3=invP(id=0) R10=fp0 ; int get_skb_ifindex(int val, struct __sk_buff skb, int var) 161: (bc) w0 = w3 162: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R3=invP(id=0) Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211216213358.3374427-2-christylee@fb.com
2021-12-16	Merge branch 'bpf: remove the cgroup -> bpf header dependecy'	Alexei Starovoitov
	Jakub Kicinski says: ==================== Changes to bpf.h tend to clog up our build systems. The netdev/bpf build bot does incremental builds to save time (reusing the build directory to only rebuild changed objects). This is the rough breakdown of how many objects needs to be rebuilt based on file touched: kernel.h 40633 bpf.h 17881 bpf-cgroup.h 17875 skbuff.h 10696 bpf-netns.h 7604 netdevice.h 7452 filter.h 5003 sock.h 4959 tcp.h 4048 As the stats show touching bpf.h is _very_ expensive. Bulk of the objects get rebuilt because MM includes cgroup headers. Luckily bpf-cgroup.h does not fundamentally depend on bpf.h so we can break that dependency and reduce the number of objects. With the patches applied touching bpf.h causes 5019 objects to be rebuilt (17881 / 5019 = 3.56x). That's pretty much down to filter.h plus noise. v2: Try to make the new headers wider in scope. Collapse bpf-link and bpf-cgroup-types into one header, which may serve as "BPF kernel API" header in the future if needed. Rename bpf-cgroup-storage.h to bpf-inlines.h. Add a fix for the s390 build issue. v3: https://lore.kernel.org/all/20211215061916.715513-1-kuba@kernel.org/ Merge bpf-includes.h into bpf.h. v4: https://lore.kernel.org/all/20211215181231.1053479-1-kuba@kernel.org/ Change course - break off cgroup instead of breaking off bpf. v5: Add forward declaration of struct bpf_prog to perf_event.h when !CONFIG_BPF_SYSCALL (kbuild bot). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-16	bpf: Remove the cgroup -> bpf header dependecy	Jakub Kicinski
	Remove the dependency from cgroup-defs.h to bpf-cgroup.h and bpf.h. This reduces the incremental build size of x86 allmodconfig after bpf.h was touched from ~17k objects rebuilt to ~5k objects. bpf.h is 2.2kLoC and is modified relatively often. We need a new header with just the definition of struct cgroup_bpf and enum cgroup_bpf_attach_type, this is akin to cgroup-defs.h. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/bpf/20211216025538.1649516-4-kuba@kernel.org
2021-12-16	add missing bpf-cgroup.h includes	Jakub Kicinski
	We're about to break the cgroup-defs.h -> bpf-cgroup.h dependency, make sure those who actually need more than the definition of struct cgroup_bpf include bpf-cgroup.h explicitly. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/bpf/20211216025538.1649516-3-kuba@kernel.org
2021-12-16	add includes masked by cgroup -> bpf dependency	Jakub Kicinski
	cgroup pulls in BPF which pulls in a lot of includes. We're about to break that chain so fix those who were depending on it. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211216025538.1649516-2-kuba@kernel.org
2021-12-16	Merge branch 'tools/bpf: Enable cross-building with clang'	Andrii Nakryiko
	Jean-Philippe Brucker says: ==================== Since v1 [1], I added Quentin's acks and applied Andrii's suggestions: * Pass CFLAGS to libbpf link in patch 3 * Substitute CLANG_CROSS_FLAGS whole in HOST_CFLAGS to avoid accidents, patch 4 Add support for cross-building BPF tools and selftests with clang, by passing LLVM=1 or CC=clang to make, as well as CROSS_COMPILE. A single clang toolchain can generate binaries for multiple architectures, so instead of having prefixes such as aarch64-linux-gnu-gcc, clang uses the -target parameter: `clang -target aarch64-linux-gnu'. Patch 1 adds the parameter in Makefile.include so tools can easily support this. Patch 2 prepares for the libbpf change from patch 3 (keep building resolve_btfids's libbpf in the host arch, when cross-building the kernel with clang). Patches 3-6 enable cross-building BPF tools with clang. [1] https://lore.kernel.org/bpf/20211122192019.1277299-1-jean-philippe@linaro.org/ ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-16	selftests/bpf: Enable cross-building with clang	Jean-Philippe Brucker
	Cross building using clang requires passing the "-target" flag rather than using the CROSS_COMPILE prefix. Makefile.include transforms CROSS_COMPILE into CLANG_CROSS_FLAGS. Clear CROSS_COMPILE for bpftool and the host libbpf, and use the clang flags for urandom_read and bench. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211216163842.829836-7-jean-philippe@linaro.org
2021-12-16	tools/runqslower: Enable cross-building with clang	Jean-Philippe Brucker
	Cross-building using clang requires passing the "-target" flag rather than using the CROSS_COMPILE prefix. Makefile.include transforms CROSS_COMPILE into CLANG_CROSS_FLAGS. Add them to CFLAGS, and erase CROSS_COMPILE for the bpftool build, since it needs to be executed on the host. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211216163842.829836-6-jean-philippe@linaro.org
2021-12-16	bpftool: Enable cross-building with clang	Jean-Philippe Brucker
	Cross-building using clang requires passing the "-target" flag rather than using the CROSS_COMPILE prefix. Makefile.include transforms CROSS_COMPILE into CLANG_CROSS_FLAGS, and adds that to CFLAGS. Remove the cross flags for the bootstrap bpftool, and erase the CROSS_COMPILE flag for the bootstrap libbpf. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211216163842.829836-5-jean-philippe@linaro.org
2021-12-16	tools/libbpf: Enable cross-building with clang	Jean-Philippe Brucker
	Cross-building using clang requires passing the "-target" flag rather than using the CROSS_COMPILE prefix. Makefile.include transforms CROSS_COMPILE into CLANG_CROSS_FLAGS. Add them to the CFLAGS. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211216163842.829836-4-jean-philippe@linaro.org
2021-12-16	tools/resolve_btfids: Support cross-building the kernel with clang	Jean-Philippe Brucker
	The CROSS_COMPILE variable may be present during resolve_btfids build if the kernel is being cross-built. Since resolve_btfids is always executed on the host, we set CC to HOSTCC in order to use the host toolchain when cross-building with GCC. But instead of a toolchain prefix, cross-build with clang uses a "-target" parameter, which Makefile.include deduces from the CROSS_COMPILE variable. In order to avoid cross-building libbpf, clear CROSS_COMPILE before building resolve_btfids. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211216163842.829836-3-jean-philippe@linaro.org
2021-12-16	tools: Help cross-building with clang	Jean-Philippe Brucker
	Cross-compilation with clang uses the -target parameter rather than a toolchain prefix. Just like the kernel Makefile, add that parameter to CFLAGS when CROSS_COMPILE is set. Unlike the kernel Makefile, we use the --sysroot and --gcc-toolchain options because unlike the kernel, tools require standard libraries. Commit c91d4e47e10e ("Makefile: Remove '--gcc-toolchain' flag") provides some background about --gcc-toolchain. Normally clang finds on its own the additional utilities and libraries that it needs (for example GNU ld or glibc). On some systems however, this autodetection doesn't work. There, our only recourse is asking GCC directly, and pass the result to --sysroot and --gcc-toolchain. Of course that only works when a cross GCC is available. Autodetection worked fine on Debian, but to use the aarch64-linux-gnu toolchain from Archlinux I needed both --sysroot (for crt1.o) and --gcc-toolchain (for crtbegin.o, -lgcc). The --prefix parameter wasn't needed there, but it might be useful on other distributions. Use the CLANG_CROSS_FLAGS variable instead of CLANG_FLAGS because it allows tools such as bpftool, that need to build both host and target binaries, to easily filter out the cross-build flags from CFLAGS. Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/bpf/20211216163842.829836-2-jean-philippe@linaro.org
2021-12-14	libbpf: Avoid reading past ELF data section end when copying license	Andrii Nakryiko
	Fix possible read beyond ELF "license" data section if the license string is not properly zero-terminated. Use the fact that libbpf_strlcpy never accesses the (N-1)st byte of the source string because it's replaced with '\0' anyways. If this happens, it's a violation of contract between libbpf and a user, but not handling this more robustly upsets CIFuzz, so given the fix is trivial, let's fix the potential issue. Fixes: 9fc205b413b3 ("libbpf: Add sane strncpy alternative and use it internally") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211214232054.3458774-1-andrii@kernel.org
2021-12-14	Merge branch 'Stop using bpf_object__find_program_by_title API'	Andrii Nakryiko
	Kui-Feng Lee says: ==================== bpf_object__find_program_by_title is going to be deprecated since v0.7. Replace all use cases with bpf_object__find_program_by_name if possible, or use bpf_object__for_each_program to iterate over programs, matching section names. V3 fixes a broken test case, fexit_bpf2bpf, in selftests/bpf, using bpf_obj__for_each_program API instead. [v2] https://lore.kernel.org/bpf/20211211003608.2764928-1-kuifeng@fb.com/ [v1] https://lore.kernel.org/bpf/20211210190855.1369060-1-kuifeng@fb.com/T/ ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-14	libbpf: Mark bpf_object__find_program_by_title API deprecated.	Kui-Feng Lee
	Deprecate this API since v0.7. All callers should move to bpf_object__find_program_by_name if possible, otherwise use bpf_object__for_each_program to find a program out from a given section. [0] Closes: https://github.com/libbpf/libbpf/issues/292 Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211214035931.1148209-5-kuifeng@fb.com