eBPF lets you instrument the kernel itself without changing its code or crashing it.

Let’s use eBPF to trace file system operations. First, install the build dependencies and build BCC from source.

sudo apt update && sudo apt install -y build-essential cmake git bison flex libfl-dev libedit-dev python3 python3-setuptools clang llvm llvm-dev libclang-dev libelf-dev zlib1g-dev linux-headers-$(uname -r)
git clone https://github.com/iovisor/bcc.git
cd bcc
mkdir build
cd build
cmake -DPYTHON_CMD=python3 ..
make
sudo make install
cd ..

Now, we’ll write a simple eBPF program that counts file open and close syscalls per user.

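#include <uapi/linux/ptrace.h>
#include <linux/fs.h>

// Per-UID counters: key is the UID, value is the number of calls seen.
BPF_HASH(open_counts, u64, u64);
BPF_HASH(close_counts, u64, u64);

int trace_entry_open(struct pt_regs *ctx) {
    // The helper returns (gid << 32) | uid; keep only the lower 32 bits.
    u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    u64 *count = open_counts.lookup(&uid);
    if (count) {
        // Entry exists: increment it atomically.
        __sync_fetch_and_add(count, 1);
    } else {
        // First call seen for this UID: create the entry.
        u64 initial_count = 1;
        open_counts.insert(&uid, &initial_count);
    }
    return 0;
}

int trace_entry_close(struct pt_regs *ctx) {
    // Same logic as above, counting close calls instead.
    u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    u64 *count = close_counts.lookup(&uid);
    if (count) {
        __sync_fetch_and_add(count, 1);
    } else {
        u64 initial_count = 1;
        close_counts.insert(&uid, &initial_count);
    }
    return 0;
}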

And a Python script to load and run it.

from bcc import BPF
import time

bpf_text = """
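#include <uapi/linux/ptrace.h>
#include <linux/fs.h>

// Per-UID counters: key is the UID, value is the number of calls seen.
BPF_HASH(open_counts, u64, u64);
BPF_HASH(close_counts, u64, u64);

int trace_entry_open(struct pt_regs *ctx) {
    // The helper returns (gid << 32) | uid; keep only the lower 32 bits.
    u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    u64 *count = open_counts.lookup(&uid);
    if (count) {
        // Entry exists: increment it atomically.
        __sync_fetch_and_add(count, 1);
    } else {
        // First call seen for this UID: create the entry.
        u64 initial_count = 1;
        open_counts.insert(&uid, &initial_count);
    }
    return 0;
}

int trace_entry_close(struct pt_regs *ctx) {
    // Same logic as above, counting close calls instead.
    u64 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    u64 *count = close_counts.lookup(&uid);
    if (count) {
        __sync_fetch_and_add(count, 1);
    } else {
        u64 initial_count = 1;
        close_counts.insert(&uid, &initial_count);
    }
    return 0;
}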
"""

b = BPF(text=bpf_text)

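# Resolve the architecture-specific kernel symbols for the syscalls.
# Modern glibc implements open() on top of the openat syscall, so we
# hook openat; hooking "open" alone would miss most file opens.
open_syscall_id = b.get_syscall_fnname("openat")
close_syscall_id = b.get_syscall_fnname("close")

# Attach a kprobe to the entry point of each syscall handler.
b.attach_kprobe(event=open_syscall_id, fn_name="trace_entry_open")
b.attach_kprobe(event=close_syscall_id, fn_name="trace_entry_close")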

print("Tracing open/close syscalls. Press Ctrl+C to stop.")

try:
    while True:
        time.sleep(2)
        print("\n--- File Operation Counts ---")
        
        open_data = b.get_table("open_counts")
        print("Open calls per UID:")
        for uid, count in open_data.items():
            print(f"  UID {uid.value}: {count.value}")

        close_data = b.get_table("close_counts")
        print("Close calls per UID:")
        for uid, count in close_data.items():
            print(f"  UID {uid.value}: {count.value}")
        
        # Clear the maps for the next interval
        open_data.clear()
        close_data.clear()

except KeyboardInterrupt:
    print("\nDetaching BPF program.")
    b.detach_kprobe(event=open_syscall_id)
    b.detach_kprobe(event=close_syscall_id)

Save the Python code as trace_fs.py. The script already embeds the C source in bpf_text, so a separate trace_fs.c file is optional and only useful for reference. Run the Python script: sudo python3 trace_fs.py.

Now, in another terminal, perform some file operations: ls, cat /etc/passwd, touch test_file.txt, rm test_file.txt. You’ll see output like this in the script’s terminal:

Tracing open/close syscalls. Press Ctrl+C to stop.

--- File Operation Counts ---
Open calls per UID:
  UID 1000: 5
Close calls per UID:
  UID 1000: 4

--- File Operation Counts ---
Open calls per UID:
  UID 1000: 2
Close calls per UID:
  UID 1000: 2
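
The two per-UID lists can be easier to read side by side. Here is a minimal sketch of one way to combine them, using plain dicts in place of the BPF maps; user_name and summarize are hypothetical helper names introduced for illustration, not part of BCC.

```python
import pwd
from typing import Dict, List, Tuple


def user_name(uid: int) -> str:
    """Resolve a UID to a user name, falling back to the number as text."""
    try:
        return pwd.getpwuid(uid).pw_name
    except (KeyError, OverflowError):
        # No passwd entry for this UID (common in containers).
        return str(uid)


def summarize(opens: Dict[int, int], closes: Dict[int, int]) -> List[Tuple[int, int, int]]:
    """Return (uid, opens, closes) rows for every UID seen in either map."""
    rows = []
    for uid in sorted(set(opens) | set(closes)):
        rows.append((uid, opens.get(uid, 0), closes.get(uid, 0)))
    return rows


# Plain dicts standing in for the values read from the two BPF maps.
for uid, n_open, n_close in summarize({1000: 5}, {1000: 4, 0: 2}):
    print(f"{user_name(uid):<12} opens={n_open} closes={n_close}")
```

In the tracing script, the dicts could be built from the map items, for example {k.value: v.value for k, v in open_data.items()}.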

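This script counts open and close syscalls per user ID (UID). The BPF_HASH maps store key-value pairs, where the key is the UID and the value is the count. bpf_get_current_uid_gid() returns a 64-bit value with the GID in the upper 32 bits and the UID in the lower 32 bits, so the lower half must be masked off to key the map by UID alone. __sync_fetch_and_add atomically increments the count, which matters because the probe can fire on several CPUs at once.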
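
The packed return value of bpf_get_current_uid_gid() is easy to get wrong, so here is a minimal Python sketch of the unpacking. split_uid_gid is a hypothetical helper name used only for illustration; it is not part of BCC.

```python
from typing import Tuple


def split_uid_gid(packed: int) -> Tuple[int, int]:
    """Split the 64-bit value from bpf_get_current_uid_gid() into (uid, gid)."""
    uid = packed & 0xFFFFFFFF          # lower 32 bits hold the UID
    gid = (packed >> 32) & 0xFFFFFFFF  # upper 32 bits hold the GID
    return uid, gid


# Example: UID 1000 and GID 1000 packed the way the helper returns them.
packed = (1000 << 32) | 1000
print(split_uid_gid(packed))  # (1000, 1000)
```

If the raw 64-bit value is used as a map key, the key mixes both IDs and prints as a very large number; masking with 0xFFFFFFFF yields the UID on its own.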

The real power comes from being able to attach to almost any kernel function (a few are blacklisted for kprobes or inlined away by the compiler). We used kprobes to hook into the entry points of the open and close syscall handlers. BCC handles the compilation and loading of the eBPF program into the kernel. The Python script acts as the user-space agent, reading the eBPF maps to retrieve and display data.
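
The script calls get_syscall_fnname rather than hard-coding a symbol because the kernel function behind a syscall is named differently across architectures and kernel versions (for example __x64_sys_openat on x86-64). The sketch below illustrates the idea of that lookup against lines in /proc/kallsyms format. It is not BCC's actual implementation; resolve_syscall_symbol and the short prefix list are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Illustrative, non-exhaustive list of syscall symbol prefixes.
SYSCALL_PREFIXES = ("__x64_sys_", "__arm64_sys_", "sys_")


def resolve_syscall_symbol(name: str, kallsyms_lines: Iterable[str]) -> Optional[str]:
    """Return the first kernel symbol matching <prefix><name>, or None."""
    # Each /proc/kallsyms line looks like: "<address> <type> <symbol> [module]"
    symbols = set()
    for line in kallsyms_lines:
        fields = line.split()
        if len(fields) >= 3:
            symbols.add(fields[2])
    for prefix in SYSCALL_PREFIXES:
        candidate = prefix + name
        if candidate in symbols:
            return candidate
    return None


# Sample lines in /proc/kallsyms format (made-up addresses).
sample = [
    "ffffffff81234560 T __x64_sys_openat",
    "ffffffff81234a00 T __x64_sys_close",
]
print(resolve_syscall_symbol("openat", sample))  # __x64_sys_openat
```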

What most people don’t realize is that the kernel verifies every eBPF program before it runs, so that a buggy program cannot crash the system. The verifier rejects programs with unbounded loops or invalid memory accesses and checks that every execution path terminates. This safety guarantee is fundamental to eBPF’s utility.

To go deeper, you’d explore tracing other file system operations like read, write, or even VFS layer functions for more granular insights.
