eBPF can show you which process is leaking memory. It does this by attaching probes to memory allocation calls from inside the kernel, so the application needs no instrumentation of its own.

Let’s see this in action. Imagine a simple C program that continuously allocates memory without freeing it:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    for (int i = 0; i < 1000000; i++) {
        char *ptr = malloc(1024); // Allocate 1KB
        if (ptr == NULL) {
            perror("malloc failed");
            return 1;
        }
        // No free(ptr);
        usleep(1000); // Wait 1ms
    }
    return 0;
}

When this runs, its memory usage will climb. To detect this with eBPF, we can use a tool like bpftrace. We want to trace malloc calls and record the size of the allocation and the PID of the process making the call.

Here’s a bpftrace script:

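// malloc lives in libc, so attach a uprobe to it.
// The library path is an assumption (Debian/Ubuntu x86_64); adjust for your system.
uprobe:/lib/x86_64-linux-gnu/libc.so.6:malloc
{
  // arg0 is the requested size; pid is a bpftrace builtin
  printf("PID %d allocated %d bytes\n", pid, arg0);
}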

If we run this script while the leaky C program is active, we’ll see output like this:

Attaching 1 probe...
PID 12345 allocated 1024 bytes
PID 67890 allocated 4096 bytes
PID 12345 allocated 1024 bytes
PID 12345 allocated 1024 bytes
...

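This tells us that allocations are happening, but not that anything is leaking. To detect a leak, we need the net change in memory: bytes allocated minus bytes freed. There is one wrinkle: free receives only a pointer, not a size, so the script has to remember the size of each live allocation and look it up when that pointer is freed. The same pattern applies to the kernel’s own allocator, kmalloc and kfree, but note that userspace malloc does not wrap kmalloc; glibc obtains its memory from the kernel through brk and mmap.

// The library path is an assumption (Debian/Ubuntu x86_64); adjust for your system.
uprobe:/lib/x86_64-linux-gnu/libc.so.6:malloc
{
  @req[tid] = arg0;   // remember the requested size until malloc returns
}

uretprobe:/lib/x86_64-linux-gnu/libc.so.6:malloc
/@req[tid]/
{
  if (retval != 0) {
    @size[pid, retval] = @req[tid];   // live allocation: pointer -> size
    @allocations[pid] = @allocations[pid] + @req[tid];
  }
  delete(@req[tid]);
}

uprobe:/lib/x86_64-linux-gnu/libc.so.6:free
/@size[pid, arg0]/
{
  // free only receives the pointer, so look up the size recorded at malloc time
  @allocations[pid] = @allocations[pid] - @size[pid, arg0];
  delete(@size[pid, arg0]);
}

interval:s:5
{
  printf("--- Interval Report ---\n");
  print(@allocations);   // net outstanding bytes per PID
}

This script maintains a map (@allocations) where keys are PIDs and values are the net bytes allocated. The uretprobe on malloc records the size of each returned pointer in @size and adds it to the process total; the probe on free looks the pointer up and subtracts its size. Every 5 seconds, it prints the net bytes for each process. If a PID’s net figure keeps growing from one report to the next, and its actual memory usage (e.g., from top or ps) is also increasing, you have very likely found your leak. Two caveats: allocations made before the script attached are not counted, and calloc and realloc are not probed here. bpftrace also limits the number of keys per map, so on a busy system @size can fill up and that limit may need raising.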

The core problem eBPF solves here is visibility that does not depend on the application. Traditional debugging often relies on application-level logs or built-in instrumentation, which can be incomplete or absent. Because the probes are attached from outside the process and run in the kernel, eBPF sees every call to the functions you hook, regardless of the application’s internal state and without recompiling or restarting it. The record is only as complete as your probe list, though: allocations made through calloc, realloc, mmap, or a custom allocator are invisible unless you probe those too.

The bpftrace script above uses maps (associative arrays) to aggregate data. The crucial step is the running total per process: on each allocation, the requested size (arg0) is added to the entry for the current PID, and on each free, the size of the released block is subtracted. The interval probe then periodically reports the balance. This difference between cumulative allocations and deallocations is the key indicator of a leak: a healthy long-running process levels off, while a leaking one keeps climbing.

A common pitfall is assuming that a free function receives the size of the block. It never does: free takes only a pointer, and so does the kernel’s kfree, whose signature is void kfree(const void *). Subtracting arg0 in a free probe therefore subtracts an address, not a byte count. To get the size, capture the pointer when it is allocated (with a uretprobe or kretprobe on the allocation function, or from the kmem:kmalloc tracepoint, which reports both ptr and bytes_alloc), store the size in a map keyed by that pointer, and look it up when the pointer is freed. This costs one map entry per live allocation, but without it the net figure is meaningless.
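The pointer lookup can be sketched for the kernel allocator with the kmem tracepoints, which expose the pointer and size as named fields. This is a minimal sketch, not a complete tool: the map names (@bytes, @owner, @net) are made up for illustration, it covers kmalloc only (not kmem_cache_alloc or page allocations), and the PID it records is whichever task happened to be running when the allocation occurred, which in interrupt context may be unrelated to the code that asked for the memory.

```bpftrace
// Net kmalloc bytes per allocating PID, matched to kfree by pointer.
tracepoint:kmem:kmalloc
/args->ptr != 0/
{
  @bytes[args->ptr] = args->bytes_alloc;   // size of this live block
  @owner[args->ptr] = pid;                 // task that was current at allocation
  @net[pid] = @net[pid] + args->bytes_alloc;
}

tracepoint:kmem:kfree
/@bytes[args->ptr]/
{
  // The freeing task can differ from the allocating one, so charge the owner.
  $owner = @owner[args->ptr];
  @net[$owner] = @net[$owner] - @bytes[args->ptr];
  delete(@bytes[args->ptr]);
  delete(@owner[args->ptr]);
}

interval:s:5
{
  print(@net);
}

END
{
  // Avoid dumping every tracked pointer when the script exits.
  clear(@bytes);
  clear(@owner);
}
```

kmalloc is one of the hottest paths in the kernel, so expect noticeable overhead and a rapidly growing pointer map; treat this as something to run for a short window rather than leave attached.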

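The two sides of the kernel boundary use different probe types. Userspace malloc and free are traced with uprobe and uretprobe, which requires knowing the exact symbol names and library paths; this is more brittle than kernel probes, since a statically linked binary or an alternative allocator such as jemalloc needs different targets. Kernel allocations are traced with kprobes or, more stably, the kmem tracepoints.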

A notable mechanical detail is that the kernel verifies every eBPF program before it runs. A program that could crash the kernel (e.g., by looping without a provable bound or accessing invalid memory) is rejected at load time. This safety check is what makes it reasonable to run these diagnostic tools on production systems, although probes on a path as hot as malloc still add measurable overhead, so attach them only for as long as you need.

The next logical step after identifying a leaking process is to pinpoint the specific allocation site within that process.
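As a rough starting point for that step, the sketch below groups requested bytes by user-space call stack for a single process. The library path and the eight-frame stack depth are assumptions, and the script sums requests rather than net outstanding bytes, so a busy but well-behaved call site can rank as high as a leaking one.

```bpftrace
// Run with the target PID as the first positional argument, e.g.:
//   bpftrace <script file> 12345
uprobe:/lib/x86_64-linux-gnu/libc.so.6:malloc
/pid == $1/
{
  // Total bytes requested from each distinct call stack (up to 8 frames).
  @requested[ustack(8)] = sum(arg0);
}
```

When the script is interrupted, bpftrace prints each stack with its byte total. Readable stacks generally require the target binary to carry symbols and, on many systems, to be built with frame pointers.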
