eBPF maps are not just key-value stores; they are sophisticated, in-kernel data structures that can be shared between eBPF programs and even between the kernel and userspace, fundamentally changing how you can observe and manipulate system behavior.

Let’s see how a simple BPF_MAP_TYPE_HASH map works in practice. Imagine we want to count how many times a specific system call, say openat, is invoked, by hooking its entry point, sys_enter_openat.

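// In your eBPF program (e.g., C)
#include "vmlinux.h"          // Kernel type definitions (u32, u64, tracepoint contexts), generated with bpftool
#include <bpf/bpf_helpers.h>  // SEC(), __uint(), __type(), and helper declarations
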
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240); // Max 10240 entries
    __type(key, u32);          // Key is the syscall number
    __type(value, u64);        // Value is the count
} syscall_counts SEC(".maps");

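// sys_enter_openat is a syscall tracepoint, not a kernel function, so attach as a tracepoint
SEC("tracepoint/syscalls/sys_enter_openat")
int trace_sys_enter_openat(struct trace_event_raw_sys_enter *ctx) {
    u32 syscall_num = ctx->id; // Syscall number recorded by the tracepoint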
    u64 *count;

    // Lookup the syscall number in the map
    count = bpf_map_lookup_elem(&syscall_counts, &syscall_num);
    if (count) {
        // If found, increment the count
        __sync_fetch_and_add(count, 1);
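    } else {
        // If not found, initialize count to 1
        u64 initial_count = 1;
        // BPF_NOEXIST fails if another CPU inserted the key first;
        // in that case, add to the existing entry instead of overwriting it
        if (bpf_map_update_elem(&syscall_counts, &syscall_num, &initial_count, BPF_NOEXIST)) {
            count = bpf_map_lookup_elem(&syscall_counts, &syscall_num);
            if (count)
                __sync_fetch_and_add(count, 1);
        }
    }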
    return 0;
}

Once this eBPF program is attached, it runs on every entry to openat. It looks up syscall_num (which here is always __NR_openat, so the map ends up holding a single key) in our syscall_counts map. If an entry exists, the program atomically increments the associated u64 value. If not, it creates a new entry with syscall_num as the key and 1 as the value. Userspace tools can then periodically read these values to get the call count.

The core problem eBPF maps solve is efficient, low-overhead data sharing and aggregation within the kernel, and between the kernel and userspace. Without maps, eBPF programs would have to rely on less efficient mechanisms, like returning values to userspace on every event, which would quickly overwhelm the system. Maps provide a persistent, accessible state that eBPF programs can read from and write to.

Internally, different map types implement different data structures, each optimized for specific access patterns and use cases. The choice of map type dictates how data is stored, how efficiently it can be accessed, and what operations are supported.

  • BPF_MAP_TYPE_HASH: This is the most common type, implementing a hash table. It’s excellent for looking up individual values by key, like our syscall counter example. Lookups, updates, and deletions are typically O(1) on average.

    • Use Case: Storing counts, tracking network connections by IP/port, associating metadata with specific PIDs.
    • Max Entries: Configurable, but each entry consumes memory.
    • Key/Value: Arbitrary data types, but keys must be unique.
  • BPF_MAP_TYPE_ARRAY: A simple array where elements are accessed by their index. It’s extremely fast for lookups if you know the index.

    • Use Case: Storing configuration parameters that are accessed by a fixed index, or when you need to perform operations on all elements in a fixed order.
    • Max Entries: Configurable, but all entries are allocated upfront.
    • Key: u32 index.
    • Value: Arbitrary data type.
  • BPF_MAP_TYPE_PERCPU_ARRAY: Similar to BPF_MAP_TYPE_ARRAY, but each element has a separate copy per CPU. A program only touches the copy belonging to the CPU it runs on, so concurrent updates from different CPUs need no atomic operations and do not contend for the same cache line.

    • Use Case: Per-CPU counters, per-CPU statistics where contention would otherwise be an issue.
    • Max Entries: Configurable; every entry is allocated upfront once per possible CPU, so value storage is roughly max_entries * value size * number of possible CPUs.
    • Key: u32 index.
    • Value: Arbitrary data type.
  • BPF_MAP_TYPE_PROG_ARRAY: An array of references to eBPF programs. One eBPF program can jump to another through it with the bpf_tail_call helper; a tail call replaces the running program rather than returning to it.

    • Use Case: Creating complex, multi-stage eBPF processing pipelines, implementing control flow based on event types.
    • Max Entries: Configurable.
    • Key: u32 index.
    • Value: u32 program file descriptor.
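  • BPF_MAP_TYPE_PERCPU_HASH: A hash table whose values are per-CPU. Keys are shared across CPUs, but each key maps to one value slot per CPU, and a program only touches the slot of the CPU it is running on.

    • Use Case: Counters or statistics keyed by values that are not known beforehand, where many CPUs would otherwise contend on the same entry.
    • Max Entries: Configurable; the limit applies to keys, and each key stores one value per possible CPU.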
    • Key: Arbitrary data type.
    • Value: Arbitrary data type.
  • BPF_MAP_TYPE_RINGBUF: A highly efficient circular buffer for sending data from eBPF programs to userspace. It follows a multi-producer, single-consumer model: a single buffer is shared by all CPUs, event order is preserved, and userspace consumes records from a memory-mapped region.

    • Use Case: Sending large volumes of event data (logs, traces) to userspace with minimal overhead.
    • Max Entries: The buffer size in bytes, which must be a power of two and a multiple of the page size (e.g., 256 * 1024).
    • Key/Value: Not applicable in the traditional sense; it’s a stream of data.
  • BPF_MAP_TYPE_STACK_TRACE: A specialized map for collecting stack traces from within eBPF programs.

    • Use Case: Performance profiling, understanding code paths leading to specific events.
    • Max Entries: Configurable.
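    • Key/Value: The key is a u32 stack ID (as returned by the bpf_get_stackid helper); the value is the array of instruction addresses that make up that stack.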
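
The sizing rules in the list above can be sketched in plain userspace C. This is a minimal sketch, not kernel code: the function names are hypothetical, and the per-CPU figure covers value storage only, ignoring the kernel's own bookkeeping overhead. It assumes that per-CPU values are rounded up to a multiple of 8 bytes and that a ring buffer size must be a power of two and page-aligned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-CPU maps store each value rounded up to a multiple of 8 bytes. */
static size_t round_up8(size_t n)
{
    return (n + 7) & ~(size_t)7;
}

/* Approximate value storage of a per-CPU array: one copy of every
 * entry for each possible CPU (hypothetical helper name). */
static size_t percpu_array_value_bytes(size_t max_entries,
                                       size_t value_size,
                                       size_t possible_cpus)
{
    return max_entries * round_up8(value_size) * possible_cpus;
}

/* A ring buffer's max_entries is its size in bytes; it must be a
 * power of two and a multiple of the page size. */
static bool ringbuf_size_ok(size_t size, size_t page_size)
{
    bool is_pow2 = size != 0 && (size & (size - 1)) == 0;
    return is_pow2 && size % page_size == 0;
}
```

For example, 256 entries of a u64 counter on a machine with 4 possible CPUs need 256 * 8 * 4 bytes of value storage, and a 256 KiB ring buffer is valid with 4 KiB pages while a 12 KiB one is not.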

When you call bpf_map_update_elem on a BPF_MAP_TYPE_HASH map with BPF_ANY, an existing key has its value overwritten and a missing key gets a new entry; BPF_NOEXIST and BPF_EXIST restrict the call to insert-only or update-only. This is generally efficient, but when many CPUs keep updating the same key, they contend on the same entry. BPF_MAP_TYPE_PERCPU_HASH, or BPF_MAP_TYPE_PERCPU_ARRAY if keys are predictable indices, performs better in that case because each CPU updates its own copy and needs neither atomics nor locks.
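
To make the update flags concrete, here is a toy userspace model of how a hash map treats them. It is only an illustration under stated assumptions: toy_map, toy_find, and toy_update are hypothetical names, the table is a tiny linear-search array rather than a real hash table, and nothing here calls into the kernel. The flag values and error codes are meant to mirror what the kernel uses for hash maps (BPF_ANY = 0, BPF_NOEXIST = 1, BPF_EXIST = 2).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as BPF_ANY, BPF_NOEXIST, BPF_EXIST in <linux/bpf.h>. */
enum { TOY_BPF_ANY = 0, TOY_BPF_NOEXIST = 1, TOY_BPF_EXIST = 2 };

#define TOY_MAX_ENTRIES 4

/* Toy stand-in for a u32 -> u64 hash map (linear search, not a real hash). */
struct toy_map {
    uint32_t keys[TOY_MAX_ENTRIES];
    uint64_t values[TOY_MAX_ENTRIES];
    size_t   used;
};

/* Return the slot index of key, or -1 if it is not present. */
static int toy_find(const struct toy_map *m, uint32_t key)
{
    for (size_t i = 0; i < m->used; i++)
        if (m->keys[i] == key)
            return (int)i;
    return -1;
}

/* Model of the update semantics: 0 on success, negative errno on failure. */
static int toy_update(struct toy_map *m, uint32_t key, uint64_t value,
                      uint64_t flags)
{
    int idx = toy_find(m, key);

    if (idx >= 0) {
        if (flags == TOY_BPF_NOEXIST)
            return -EEXIST;          /* insert-only, but key is present */
        m->values[idx] = value;      /* BPF_ANY / BPF_EXIST: overwrite */
        return 0;
    }
    if (flags == TOY_BPF_EXIST)
        return -ENOENT;              /* update-only, but key is missing */
    if (m->used == TOY_MAX_ENTRIES)
        return -E2BIG;               /* map is full */
    m->keys[m->used] = key;
    m->values[m->used] = value;
    m->used++;
    return 0;
}
```

In this model, BPF_NOEXIST on a present key fails instead of resetting the stored value, which is why it is the safer flag when a counter is first initialized.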

The most surprising thing about eBPF maps is that userspace programs can manipulate them directly through file descriptors. Userspace obtains a map's file descriptor when the map is created (for example, by the loader), by opening a map pinned in the BPF filesystem, or by looking the map up by ID. These descriptors do not support read, write, or lseek like regular files. Instead, userspace passes the descriptor to the bpf() system call with commands such as BPF_MAP_LOOKUP_ELEM, BPF_MAP_UPDATE_ELEM, BPF_MAP_DELETE_ELEM, and BPF_MAP_GET_NEXT_KEY, usually through libbpf wrappers. This allows dynamic configuration and data retrieval without recompiling or reloading the eBPF program. For instance, you can write configuration values to a map and have your eBPF program read them at runtime, or poll a map for collected statistics.
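
As a rough sketch of what that userspace access looks like without libbpf, the wrappers below call the bpf() system call directly. The names sys_bpf, map_lookup, and map_update are hypothetical; map_fd is assumed to be a valid map descriptor obtained elsewhere, and the key and value buffers must match the sizes the map was created with. In practice most tools use libbpf's wrappers rather than raw syscalls.

```c
#define _GNU_SOURCE
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc provides no bpf() function, so go through syscall(2). */
static long sys_bpf(int cmd, union bpf_attr *attr)
{
    return syscall(SYS_bpf, cmd, attr, sizeof(*attr));
}

/* Copy the value stored under key into value_out.
 * Returns 0 on success, -1 with errno set on failure. */
static long map_lookup(int map_fd, const void *key, void *value_out)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));   /* unused fields must be zero */
    attr.map_fd = (uint32_t)map_fd;
    attr.key    = (uint64_t)(uintptr_t)key;
    attr.value  = (uint64_t)(uintptr_t)value_out;
    return sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr);
}

/* Store value under key; flags is BPF_ANY, BPF_NOEXIST, or BPF_EXIST.
 * Returns 0 on success, -1 with errno set on failure. */
static long map_update(int map_fd, const void *key, const void *value,
                       uint64_t flags)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (uint32_t)map_fd;
    attr.key    = (uint64_t)(uintptr_t)key;
    attr.value  = (uint64_t)(uintptr_t)value;
    attr.flags  = flags;
    return sys_bpf(BPF_MAP_UPDATE_ELEM, &attr);
}
```

A configuration tool could call map_update with BPF_ANY to change a setting while the eBPF program keeps running, and a monitoring tool could call map_lookup in a loop to poll a counter.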

If you find yourself frequently needing to aggregate counters across all CPUs for a specific event, you’ll likely want to explore BPF_MAP_TYPE_PERCPU_ARRAY and how to sum its values in userspace.
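
A minimal sketch of that summing step follows. It assumes the values are u64 counters and that the buffer was already filled by a lookup on the per-CPU map; with libbpf, such a lookup writes one value per possible CPU, and libbpf_num_possible_cpus() reports how many slots to allocate. The helper name sum_percpu is hypothetical, and the sketch works on a plain array so it stays independent of any particular loader.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the per-CPU copies of one counter.
 * values: buffer filled by a per-CPU map lookup, one slot per possible CPU
 * ncpus:  number of possible CPUs (slots in the buffer) */
static uint64_t sum_percpu(const uint64_t *values, size_t ncpus)
{
    uint64_t total = 0;

    for (size_t cpu = 0; cpu < ncpus; cpu++)
        total += values[cpu];
    return total;
}
```

The sum is only a snapshot: CPUs may keep incrementing their copies while userspace reads them, which is usually acceptable for statistics.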
