The eBPF verifier doesn’t just look for unsafe programs; it actively simulates their execution to prove they’ll always terminate and never access invalid memory.
Let’s watch a simple eBPF program get verified. Imagine we have a program that wants to read the current network packet’s length and store it in a map.
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(int));
        __uint(value_size, sizeof(long));
    } packet_len_map SEC(".maps");

    SEC("xdp")
    int xdp_packet_len(struct xdp_md *ctx) {
        int key = 0;
        long *value;

        // Try to get the packet length
        long len = ctx->data_end - ctx->data;

        // Store it in the map
        value = bpf_map_lookup_elem(&packet_len_map, &key);
        if (value) {
            *value = len;
        }
        return XDP_PASS;
    }
The verifier starts by analyzing the xdp_packet_len function. It sees the entry point and begins tracing possible execution paths. At each instruction, it maintains a "state" representing the program’s current status: which registers hold what kinds of values (e.g., a pointer to xdp_md, an integer, an unknown value), and which memory regions are accessible.
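To make the idea of a tracked "state" concrete, here is a minimal user-space sketch of what the verifier knows at the first instruction. It is a toy model, not kernel code: the names `reg_kind`, `verifier_state`, `entry_state`, and `reg_readable` are made up for illustration, and the real verifier tracks many more register types plus value ranges and stack contents. The one detail taken from eBPF itself is the calling convention: R1 carries the context pointer and R10 is the frame pointer.

```c
#include <stdbool.h>

#define NUM_REGS 11 /* eBPF has registers R0 through R10 */

/* Hypothetical, heavily simplified register kinds. */
enum reg_kind {
    REG_NOT_INIT,     /* never written; reading it is rejected */
    REG_SCALAR,       /* a plain number, not a pointer */
    REG_PTR_TO_CTX,   /* pointer to the context (struct xdp_md for XDP) */
    REG_PTR_TO_STACK, /* pointer into the program's stack frame */
};

struct verifier_state {
    enum reg_kind regs[NUM_REGS];
};

/* State at the first instruction: R1 = context, R10 = frame pointer. */
static struct verifier_state entry_state(void)
{
    struct verifier_state st;

    for (int i = 0; i < NUM_REGS; i++)
        st.regs[i] = REG_NOT_INIT;
    st.regs[1] = REG_PTR_TO_CTX;
    st.regs[10] = REG_PTR_TO_STACK;
    return st;
}

/* A register may be used as an operand only after it has been written. */
static bool reg_readable(const struct verifier_state *st, int regno)
{
    return regno >= 0 && regno < NUM_REGS &&
           st->regs[regno] != REG_NOT_INIT;
}
```

In this model, every instruction the verifier simulates is a function from one `verifier_state` to the next, and an instruction that reads a `REG_NOT_INIT` register ends verification with an error.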
The verifier explores a set of such states, one per execution path. At entry there is a single state: R1 holds a pointer to a valid xdp_md context, R10 holds the frame pointer, and every other register is uninitialized, so reading one before writing it is an error. When it reaches long len = ctx->data_end - ctx->data;, it first checks that both loads hit valid fields of xdp_md. For XDP programs the verifier does not treat the loaded values as plain integers: it tracks ctx->data as a pointer to the start of the packet and ctx->data_end as a pointer to its end. Their difference is a scalar whose exact value is unknown at verification time. That is fine here, because len is only stored and never used as an address.
Next comes value = bpf_map_lookup_elem(&packet_len_map, &key);. The verifier checks the arguments against the helper’s signature: the first must be a map, and the second must point to initialized memory of at least key_size bytes, here the int key on the stack. It does not need to prove that the key is present in the map, because the helper reports a miss by returning NULL. The verifier therefore types the return value as PTR_TO_MAP_VALUE_OR_NULL: a pointer that is either NULL or refers to a value_size-byte map value, and that cannot be dereferenced until the program has ruled out NULL.
The if (value) check is crucial. At the branch the verifier forks the state. On the path where value is NULL, the body of the if is skipped. On the other path, the register is upgraded from PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE, a pointer known to be non-NULL. Inside the if, *value = len; writes sizeof(long) bytes at offset 0 of a region that is value_size bytes long. The two sizes match, so the store stays in bounds and is accepted. Without the NULL check, the same store would be rejected.
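The effect of the NULL check can be sketched as a small user-space model. This is an illustration under simplifying assumptions, not the kernel's implementation: `branch_states`, `split_on_null_check`, and `may_deref` are hypothetical names, and the enum only borrows the spirit of the kernel's register types.

```c
#include <stdbool.h>

/* Hypothetical, simplified register kinds. */
enum reg_kind {
    REG_SCALAR,                   /* plain number (including a known 0) */
    REG_PTR_TO_MAP_VALUE,         /* map value pointer, known non-NULL */
    REG_PTR_TO_MAP_VALUE_OR_NULL, /* lookup result before the NULL check */
};

/* The two states produced by a branch such as `if (value)`. */
struct branch_states {
    enum reg_kind if_nonzero; /* register kind on the "taken" path */
    enum reg_kind if_zero;    /* register kind on the "skipped" path */
};

/* Model `if (reg)`: fork the state and refine a maybe-NULL pointer. */
static struct branch_states split_on_null_check(enum reg_kind reg)
{
    struct branch_states out = { reg, reg };

    if (reg == REG_PTR_TO_MAP_VALUE_OR_NULL) {
        out.if_nonzero = REG_PTR_TO_MAP_VALUE; /* NULL ruled out */
        out.if_zero = REG_SCALAR;              /* just the number 0 */
    }
    return out;
}

/* Only a pointer that is known to be non-NULL may be dereferenced. */
static bool may_deref(enum reg_kind reg)
{
    return reg == REG_PTR_TO_MAP_VALUE;
}
```

The design point is that the check does not add information at runtime; it changes what the verifier can prove about each path, which is why moving the store outside the if turns an accepted program into a rejected one.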
Finally, return XDP_PASS; terminates the program successfully. The verifier, having explored all paths and found no way for the program to crash, loop infinitely, or access unauthorized memory, allows it to be loaded.
What if we made a mistake? Suppose we tried to write to an arbitrary memory address:
    SEC("xdp")
    int xdp_bad_access(struct xdp_md *ctx) {
        char *ptr = (char *)0xffff; // Arbitrary, invalid address
        *ptr = 1;                   // This will be rejected
        return XDP_PASS;
    }
The verifier hits char *ptr = (char *)0xffff;. The C cast means nothing to it: all it sees is a constant being loaded into a register, and it tracks that register as a scalar, a plain number. A register counts as a pointer only when its origin is known. The context pointer arrives in R1, pointers loaded from xdp_md fields are tracked as packet pointers, pointers returned by helpers such as bpf_map_lookup_elem are tracked as map-value pointers, and arithmetic on any of these keeps the pointer type while adjusting the tracked offset. A raw constant like 0xffff has no such origin.
When the verifier reaches *ptr = 1;, it checks the type of the register used as the address, finds a scalar, and rejects the program. On recent kernels the log contains a line such as R1 invalid mem access 'scalar'; the register number depends on the compiled code, and older kernels print 'inv' instead of 'scalar'.
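The provenance rule can be illustrated with a toy access check in user-space C. Everything here is a simplified assumption made for the example: `reg_state`, `load_imm`, `add_const`, and `access_ok` are hypothetical names, only one pointer kind is modeled, and the real verifier tracks ranges of possible offsets rather than a single constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified register kinds. */
enum reg_kind { REG_SCALAR, REG_PTR_TO_MAP_VALUE };

struct reg_state {
    enum reg_kind kind;
    int64_t val;          /* scalar: its value; pointer: offset into region */
    uint32_t region_size; /* pointer only: size of the region in bytes */
};

/* `r = imm`: a constant is always a scalar, whatever the C cast says. */
static struct reg_state load_imm(int64_t imm)
{
    struct reg_state r = { REG_SCALAR, imm, 0 };
    return r;
}

/* `r += k`: adding a constant keeps the kind and moves the value/offset. */
static struct reg_state add_const(struct reg_state r, int64_t k)
{
    r.val += k;
    return r;
}

/* Would a `size`-byte load or store through `r` be accepted? */
static bool access_ok(const struct reg_state *r, uint32_t size)
{
    if (r->kind == REG_SCALAR)
        return false; /* no known origin, never dereferenceable */
    if (r->val < 0)
        return false; /* before the start of the region */
    return (uint64_t)r->val + size <= r->region_size;
}
```

In this model, `load_imm(0xffff)` fails `access_ok` for any size, while a map-value pointer into an 8-byte region passes for accesses that stay inside those 8 bytes, even after `add_const` has moved it.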
This simulation is what makes eBPF safe. It’s not about runtime checks; it’s about static, exhaustive analysis before the program ever runs on live data.
One common point of confusion is how the verifier handles loops. Since Linux 5.3 it accepts bounded loops such as for (int i = 0; i < 10; i++) by simulating every iteration and confirming that the exit condition is eventually reached. On older kernels any backward jump was rejected, so loops had to be unrolled by the compiler before the verifier ever saw them. A loop whose exit the verifier cannot prove, like while (1) with no reachable break or return, is rejected, either because the verifier notices it is revisiting a state it has already seen or because the simulation runs out of budget. That budget is a cap on the number of instructions the verifier processes across all paths, not on instructions executed at runtime: 1 million for privileged programs on Linux 5.2 and later.
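The instruction budget can be pictured with a deliberately naive user-space model. It assumes a single loop with a fixed body size and ignores everything that makes the real verifier cheaper or more expensive (state pruning, branches inside the body); `INSN_BUDGET` and `loop_fits_budget` are hypothetical names, and the constant is simply the 1 million figure mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

#define INSN_BUDGET 1000000u /* assumed cap on simulated instructions */

/*
 * Walk a loop iteration by iteration, the way a simulating verifier would,
 * charging every simulated instruction against the budget.
 * Returns false as soon as the budget is exceeded.
 */
static bool loop_fits_budget(uint64_t iterations,
                             uint32_t insns_per_iteration,
                             uint64_t *insns_processed)
{
    uint64_t processed = 0;

    for (uint64_t i = 0; i < iterations; i++) {
        processed += insns_per_iteration;
        if (processed > INSN_BUDGET) {
            *insns_processed = processed;
            return false; /* too complex to verify */
        }
    }
    *insns_processed = processed;
    return true;
}
```

Under these assumptions a 10-iteration loop with a 5-instruction body costs 50 simulated instructions, while a loop with a million iterations and a 2-instruction body runs out of budget about halfway through, even though it would terminate at runtime.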
The next thing you’ll grapple with is how to efficiently manage the state space for complex programs, as the number of possible execution paths can explode, leading to verification timeouts or excessive memory usage by the verifier itself.