eBPF lets you peer inside your containers by hooking into the kernel, bypassing the need for sidecar containers to observe network traffic or system calls.
Here’s a simple Go program that acts as a basic web server, listening on port 8080:
package main

import (
	"fmt"
	"log"
	"net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello from the Go server!\n")
}

func main() {
	http.HandleFunc("/", handler)
	fmt.Println("Starting server on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}
We can build this into a Docker image:
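FROM golang:1.20-alpine
WORKDIR /app
# Assumes a go.mod sits next to the Go source in the build context
# (create one with "go mod init <module>"); "go build" fails without it.
COPY . .
RUN go build -o server
CMD ["./server"]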
And run it:
docker build -t my-go-server .
docker run -d --name go-server-instance -p 8080:8080 my-go-server
Now, instead of injecting a sidecar to sniff traffic, we can use eBPF. Let’s say we want to see TCP connections made to port 8080, where our go-server-instance is published. We can use a tool like bpftool (part of the linux-tools-common package on many distros) to inspect loaded eBPF programs, or a framework for writing them such as bcc or cilium/ebpf.
Let’s use a bcc Python script for demonstration. First, ensure you have bcc installed (e.g., sudo apt update && sudo apt install bpfcc-tools python3-bpfcc).
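This script attaches a kprobe to the inet_create kernel function, which is called when a new IPv4 socket is created, and records the PID of every process that creates a TCP (stream) socket. It does not filter on port 8080 yet, because a socket has no port associated with it at creation time.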
#!/usr/bin/python3
import time

from bcc import BPF

prog = """
#include <uapi/linux/ptrace.h>
#include <linux/net.h>
#include <net/sock.h>
#include <bcc/proto.h>

BPF_HASH(pid_map, u64, u64);

// Kernel signature: inet_create(struct net *net, struct socket *sock, int protocol, int kern)
int kprobe__inet_create(struct pt_regs *ctx, struct net *net, struct socket *sock, int protocol, int kern) {
    // Check if it's a TCP socket. socket(AF_INET, SOCK_STREAM, 0) passes
    // protocol 0, so test the socket type rather than the protocol number.
    if (sock->type != SOCK_STREAM) {
        return 0;
    }
    // Get the PID of the process creating the socket
    u64 pid = bpf_get_current_pid_tgid() >> 32;
    pid_map.update(&pid, &pid); // Store PID to potentially filter later if needed
    return 0;
}
"""

# The kprobe__ prefix makes bcc attach the function to inet_create automatically
b = BPF(text=prog)
print("Attached kprobe to inet_create. Recording PIDs that create TCP sockets...")

try:
    while True:
        # In a real scenario, you'd process events here.
        # For this example, we're just demonstrating attachment.
        # To see actual traffic, you'd need to hook into functions like tcp_connect, tcp_v4_connect, etc.
        # and extract port information.
        time.sleep(1)  # sleep instead of spinning the CPU
except KeyboardInterrupt:
    # bcc detaches its probes automatically when the process exits
    print("Detaching BPF program.")
This example is simplified to show attachment. To actually observe the specific port, you’d need to go deeper. A more practical script would involve:
- Kprobing tcp_v4_connect or tcp_v6_connect: These functions are called when a TCP connection is initiated.
- Accessing struct sock: Within these probes, you can access the socket structure.
- Extracting sk_dport: The struct sock has a member sk_dport (or similar, depending on kernel version and exact struct) which holds the destination port in network byte order. It is only filled in once the connect function has run, so it is read at function return.
- Filtering: Compare sk_dport, converted to host byte order, to 8080.
- Outputting: Print the PID, comm, and port information.
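The filtering step is easy to get wrong because of byte order, so here is a minimal standard-library sketch of the pitfall. It assumes a little-endian host such as x86-64 or most ARM systems; the helper name ntohs_le is made up for this illustration and only mimics what ntohs does on such a host.

```python
import struct

TARGET_PORT = 8080

# The two bytes of the port as they are stored in network (big-endian) order
wire = struct.pack("!H", TARGET_PORT)

# What a little-endian CPU sees if it reads those bytes without converting
raw_le = int.from_bytes(wire, "little")


def ntohs_le(value):
    """Swap the two bytes of a 16-bit value (ntohs on a little-endian host)."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")


print(raw_le == TARGET_PORT)            # False: the raw field is not 8080
print(ntohs_le(raw_le) == TARGET_PORT)  # True: convert first, then compare
```

This is why the port has to be converted before it is compared to 8080: a filter that compares the raw field never matches on such a host.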
Here’s a more illustrative bcc script that attempts to capture this:
#!/usr/bin/python3
from bcc import BPF

# Define the BPF program
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <bcc/proto.h>

// Target port, in host byte order
#define TARGET_PORT 8080

// Event structure to send data from kernel to userspace
struct tcp_event_t {
    u32 pid;
    char comm[TASK_COMM_LEN];
    u16 dport;
    u16 sport;
};

// Perf buffer used to send events to userspace
BPF_PERF_OUTPUT(events);

// Sockets with a connect in progress, keyed by thread ID
BPF_HASH(currsock, u32, struct sock *);

// Entry probe: remember the socket. The ports are not final yet,
// so nothing is reported here.
int trace_connect_entry(struct pt_regs *ctx, struct sock *sk) {
    u32 tid = bpf_get_current_pid_tgid();
    currsock.update(&tid, &sk);
    return 0;
}

// Return probe: the kernel has now filled in both ports
int trace_connect_return(struct pt_regs *ctx) {
    int ret = PT_REGS_RC(ctx);
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 tid = pid_tgid;

    struct sock **skpp = currsock.lookup(&tid);
    if (skpp == 0) {
        return 0; // missed the entry probe
    }
    if (ret != 0) {
        currsock.delete(&tid); // the connection attempt failed immediately
        return 0;
    }

    // skc_dport is in network byte order; skc_num (the local port) is in host byte order
    struct sock *sk = *skpp;
    u16 dport = sk->__sk_common.skc_dport;
    u16 sport = sk->__sk_common.skc_num;
    currsock.delete(&tid);

    // Filter for the target port
    if (ntohs(dport) != TARGET_PORT) {
        return 0;
    }

    struct tcp_event_t event = {};
    event.pid = pid_tgid >> 32;
    bpf_get_current_comm(&event.comm, sizeof(event.comm));
    event.dport = ntohs(dport); // Network to host byte order for port
    event.sport = sport;
    events.perf_submit(ctx, &event, sizeof(event));
    return 0;
}
"""

# Load BPF program and attach the probes
b = BPF(text=bpf_text)
b.attach_kprobe(event="tcp_v4_connect", fn_name="trace_connect_entry")
b.attach_kretprobe(event="tcp_v4_connect", fn_name="trace_connect_return")
# Optional, for IPv6: the same pair works because only struct sock is read
# b.attach_kprobe(event="tcp_v6_connect", fn_name="trace_connect_entry")
# b.attach_kretprobe(event="tcp_v6_connect", fn_name="trace_connect_return")

# Define the event handler function
def print_event(cpu, data, size):
    event = b["events"].event(data)  # bcc decodes tcp_event_t for us
    comm = event.comm.decode("utf-8", "replace")
    print(f"PID: {event.pid:<6} Comm: {comm:<16} DPort: {event.dport:<5} SPort: {event.sport}")

# Open the perf event buffer and set the handler
b["events"].open_perf_buffer(print_event)

print("Listening for TCP connections to port 8080...")
print("PID Comm DPort SPort")

try:
    while True:
        b.perf_buffer_poll()
except KeyboardInterrupt:
    # bcc detaches its probes automatically when the process exits
    print("Detaching BPF program.")
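To make the kernel-to-userspace hand-off concrete, here is a small standard-library sketch of how the bytes of one tcp_event_t record map onto fields. It assumes TASK_COMM_LEN is 16 and that the struct has no padding (4 + 16 + 2 + 2 = 24 bytes); the names TcpEvent and decode_event are made up for this illustration. In a bcc script the table's event() method normally does this decoding for you.

```python
import ctypes
import struct

TASK_COMM_LEN = 16  # assumed to match the kernel's TASK_COMM_LEN


class TcpEvent(ctypes.Structure):
    """Userspace mirror of struct tcp_event_t from the BPF program."""
    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("comm", ctypes.c_char * TASK_COMM_LEN),
        ("dport", ctypes.c_uint16),
        ("sport", ctypes.c_uint16),
    ]


def decode_event(raw):
    """Decode one raw record into a plain dict."""
    event = TcpEvent.from_buffer_copy(raw)
    return {
        "pid": event.pid,
        # A c_char array field reads back as bytes cut at the first NUL
        "comm": event.comm.decode("utf-8", "replace"),
        "dport": event.dport,
        "sport": event.sport,
    }


# Hand-built sample record standing in for what the kernel would emit
sample = struct.pack("=I16sHH", 12345, b"curl", 8080, 54321)
print(ctypes.sizeof(TcpEvent))
print(decode_event(sample))
```

Mirroring the struct field by field keeps the userspace side in step with the kernel side: if a field is added to tcp_event_t, the mismatch shows up as a size difference instead of silently shifted values.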
When you run this script (sudo python3 your_script.py), and then make a request to your Go server over IPv4 (e.g., curl http://127.0.0.1:8080 in another terminal; localhost may resolve to ::1, which the IPv4 probe does not see), you’ll see output like:
Listening for TCP connections to port 8080...
PID Comm DPort SPort
PID: 12345 Comm: curl DPort: 8080 SPort: 54321
This shows the curl process (PID 12345) initiating a connection to port 8080. Note that this is the client side of the connection. To observe the server side, where the Go process accepts the connection, you could hook the return of inet_csk_accept (the function that bcc’s tcpaccept tool traces) or other functions as well.
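If you would rather generate a test connection from Python than from curl, a minimal sketch like the following can help. It opens one TCP connection and reports the local source port, which you can compare with the SPort column printed by the tracer. The function name connect_once is made up for this illustration, and the address 127.0.0.1:8080 assumes the container from earlier is running with its port published on the local host.

```python
import socket


def connect_once(host, port, timeout=2.0):
    """Open one TCP connection and return the local (source) port used."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        return conn.getsockname()[1]


if __name__ == "__main__":
    try:
        print("local source port:", connect_once("127.0.0.1", 8080))
    except OSError as exc:
        # Nothing is listening, or the connection timed out
        print("could not connect:", exc)
```

With the tracer running, the Comm column should show the Python interpreter instead of curl, and SPort should match the port printed here.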
The core idea is that eBPF allows you to instrument the Linux kernel without modifying kernel code or application code. You’re essentially writing small programs that run in a sandboxed environment within the kernel, triggered by specific events (like function calls). This makes it incredibly powerful for observability, security, and networking, as you can gain deep insights into system behavior that would otherwise be opaque or require intrusive measures like sidecars.
The real magic of eBPF is its ability to attach to most kernel functions (via kprobes/kretprobes; functions that were inlined or are on the kprobe blacklist are excluded) or user-space functions (via uprobes), and to trace network events directly from the kernel’s networking stack. This avoids the overhead and complexity of sidecar containers that often rely on packet capture (like tcpdump) or application-level instrumentation, which can be less performant and harder to manage across diverse workloads.
One aspect people often overlook is the ability to correlate events across different kernel subsystems using eBPF. For instance, you could trace a network connection being established, then trace the corresponding system calls made by the application handling that connection, and even trace file I/O operations triggered by that request, all within a single eBPF program or a set of coordinated programs. This provides a holistic view of request lifecycles that’s hard to achieve with traditional monitoring tools.
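One simple way to do the correlation is in userspace: have every probe emit the PID and a timestamp, then group the events by PID. The sketch below shows only that grouping step on hand-written data. The event kinds (tcp_accept, sys_read, file_open) and the field names are hypothetical placeholders, not the output of any particular tool.

```python
from collections import defaultdict


def correlate_by_pid(events):
    """Group events by PID, ordered by timestamp within each PID."""
    timelines = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts_ns"]):
        timelines[event["pid"]].append(event["kind"])
    return dict(timelines)


# Hypothetical events as three different probes might report them
events = [
    {"pid": 4242, "ts_ns": 300, "kind": "file_open"},
    {"pid": 4242, "ts_ns": 100, "kind": "tcp_accept"},
    {"pid": 17, "ts_ns": 150, "kind": "tcp_accept"},
    {"pid": 4242, "ts_ns": 200, "kind": "sys_read"},
]

print(correlate_by_pid(events))
```

In a Go server, many requests are handled inside one process, so PID alone is a coarse key; a thread ID or a socket pointer would likely be needed to separate individual requests.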
The next step is to explore how to trace the connections the server accepts, or to observe system calls made by the server process itself, using eBPF.