BTF allows eBPF programs to remain portable across different kernel versions without recompilation.
Let’s see BTF in action. Imagine you have a simple eBPF program that attaches to sys_enter_execve and prints the filename being executed.
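// SPDX-License-Identifier: GPL-2.0
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

// Tracepoint handlers receive the tracepoint's own context, not struct pt_regs.
SEC("tp/syscalls/sys_enter_execve")
int bpf_prog1(struct trace_event_raw_sys_enter *ctx)
{
    char filename[256];
    // args[0] is execve's first argument: a user-space pointer to the filename
    const char *user_filename = (const char *)ctx->args[0];

    // Copy the NUL-terminated string from user memory into the stack buffer
    if (bpf_probe_read_user_str(filename, sizeof(filename), user_filename) < 0)
        return 0;
    bpf_printk("execve: %s\n", filename);
    return 0;
}

char _license[] SEC("license") = "GPL";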
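Without BTF, a program like this is tied to the kernel headers it was compiled against. If the layout of a kernel structure it reads (the tracepoint context, struct pt_regs, struct task_struct) differs on the running kernel, the hardcoded offsets point at the wrong bytes. The program will not crash the kernel, but it may be rejected by the verifier or silently produce incorrect output. BTF gives the loader the type information it needs to adapt the program to the underlying kernel's data structures.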
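BTF, which stands for BPF Type Format, is a compact metadata format that describes C types: structs, their fields and offsets, and function signatures. A kernel built with CONFIG_DEBUG_INFO_BTF exposes the BTF for its own types at /sys/kernel/btf/vmlinux. When you compile an eBPF program with debug info enabled (clang with -g and the BPF target), the compiler also generates BTF describing the types used in your program. libbpf then passes this metadata to the kernel along with the program.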
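When libbpf loads your program, it compares the program's BTF with the running kernel's BTF to learn how types like struct pt_regs or struct task_struct are laid out on that specific kernel. Instead of hardcoding offsets or relying on assumptions about the kernel's internal layout, the program records relocations that effectively ask, "What is the offset of the tgid field within struct task_struct on this kernel?" or "What is the size of struct task_struct?". libbpf answers these by matching types and fields by name and patching the instructions before the verifier sees them. This mechanism is known as CO-RE (Compile Once, Run Everywhere). The verifier, in turn, uses the kernel's BTF to check that typed accesses are valid.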
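Resolving a field offset by name rather than hardcoding it can be pictured with a small user-space model. The sketch below is a toy, not libbpf's implementation: the layouts task_v1 and task_v2, the field_info table, and the helper names resolve_offset and read_int_at are all hypothetical, and real BTF encodes far more than names and offsets.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for BTF: field names and their byte offsets in one struct. */
struct field_info {
    const char *name;
    size_t offset;
};

/* Hypothetical layout the program was compiled against. */
struct task_v1 {
    int pid;
    int tgid;
};

/* Hypothetical layout on the running kernel: a field was added before tgid. */
struct task_v2 {
    int pid;
    long start_time;
    int tgid;
};

static const struct field_info task_v1_fields[] = {
    { "pid",  offsetof(struct task_v1, pid)  },
    { "tgid", offsetof(struct task_v1, tgid) },
};

static const struct field_info task_v2_fields[] = {
    { "pid",        offsetof(struct task_v2, pid)        },
    { "start_time", offsetof(struct task_v2, start_time) },
    { "tgid",       offsetof(struct task_v2, tgid)       },
};

/* Look a field up by name; store its offset and return 0, or return -1. */
static int resolve_offset(const struct field_info *fields, size_t n,
                          const char *name, size_t *out)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(fields[i].name, name) == 0) {
            *out = fields[i].offset;
            return 0;
        }
    }
    return -1;
}

/* Read an int from a raw object at an already-resolved offset. */
static int read_int_at(const void *obj, size_t offset)
{
    int value;

    memcpy(&value, (const char *)obj + offset, sizeof(value));
    return value;
}
```

Calling resolve_offset with task_v2_fields and the name "tgid" yields the offset that is correct for the second layout, even though code compiled against task_v1 would have assumed a different one. A lookup for a field that does not exist in the target layout fails, which loosely mirrors a relocation that cannot be satisfied at load time.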
The read helpers themselves, such as bpf_probe_read_kernel and bpf_probe_read_user_str, are not BTF-aware: they copy bytes from whatever address you hand them. BTF's contribution is making the computation of that address portable. In the execve example, the address is simply the filename pointer passed to the syscall, which is part of the stable user-facing ABI, so no relocation is involved. Relocation matters when the address is derived from a field inside a kernel structure whose layout can change between versions.
The toolchain splits this work. Clang emits the program's BTF at compile time, libbpf loads that BTF alongside the program and applies relocations against the kernel's BTF, and bpftool lets you inspect both. The vmlinux.h header, generated with bpftool btf dump file /sys/kernel/btf/vmlinux format c, contains C definitions for the types in the kernel's BTF, bridging the gap between your C code and the kernel's type information without pulling in kernel headers.
Consider accessing a nested field. Without BTF, you'd be manually calculating offsets based on kernel version headers, a brittle process. With BTF, you can use macros that abstract this. For instance, if you wanted to read a field from a struct sock that is reached through a pointer in another structure, each hop in that path is recorded as a relocation, and libbpf resolves it against the running kernel. The BPF_CORE_READ macro from libbpf's bpf_core_read.h, built on the lower-level bpf_core_read, chains these relocatable reads safely and portably, issuing one bpf_probe_read_kernel per pointer hop.
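As a minimal sketch of such a nested read, the program below follows task->real_parent->tgid from the current task. It assumes a vmlinux.h generated for your kernel and libbpf's bpf_core_read.h header; the function name trace_parent is made up for illustration, and it uses struct task_struct rather than struct sock only because the current task is easy to obtain.

```c
// SPDX-License-Identifier: GPL-2.0
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

SEC("tp/syscalls/sys_enter_execve")
int trace_parent(void *ctx)
{
    // The helper returns the current task_struct pointer as a u64
    struct task_struct *task = (struct task_struct *)bpf_get_current_task();

    // Two relocatable hops: task->real_parent, then ->tgid.
    // Each field offset is resolved by libbpf against the running kernel's BTF.
    int parent_tgid = BPF_CORE_READ(task, real_parent, tgid);

    bpf_printk("execve called, parent tgid %d\n", parent_tgid);
    return 0;
}

char _license[] SEC("license") = "GPL";
```

The same pattern should extend to longer paths by listing more field names in the macro, one per hop. If a named field is missing from the running kernel, the load is expected to fail rather than read the wrong memory.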
The core idea is that the compiled eBPF program doesn't bake in assumptions about kernel structure layout. Instead, it declares which fields it intends to access, and libbpf, using the kernel's BTF and the program's own BTF, resolves these accesses at load time. This makes your eBPF programs resilient to kernel upgrades, as long as the fields they name still exist with compatible types and the syscall or tracepoint they attach to remains compatible at a high level.
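When using libbpf, you often see macros like PT_REGS_PARM2(ctx) in kprobe programs. These are not BTF-aware on their own: they expand to a fixed struct pt_regs field chosen at compile time for the target architecture (selected with a define such as __TARGET_ARCH_x86), following that architecture's calling convention. For relocatable access, libbpf also provides _CORE variants such as PT_REGS_PARM2_CORE(ctx), which read the same register through BPF_CORE_READ so that the field offset is resolved against the running kernel's pt_regs at load time. These macros apply to programs whose context is a struct pt_regs; a tracepoint such as sys_enter_execve exposes its arguments through the tracepoint's own context structure instead.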
The most surprising thing is that BTF isn’t just for eBPF programs; it’s a general-purpose type information system within the kernel. This means that other kernel subsystems can also leverage BTF to introspect types, enabling more dynamic and adaptable kernel features beyond eBPF. It’s a foundational piece of infrastructure for modern kernel development.
The next challenge you’ll face is understanding how to use BTF to access more complex data structures and events, especially those that have undergone significant ABI changes between kernel versions.