eBPF TC programs are not just a way to filter packets; they fundamentally change how you think about network packet processing by allowing you to inject custom logic directly into the kernel’s network stack at defined points.
Let’s see this in action. Imagine we want to drop all packets destined for port 80 (HTTP) on a specific server.
# First, create a simple eBPF program to drop HTTP traffic
cat << 'EOF' > drop_http.c
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>        // IPPROTO_TCP
#include <linux/ip.h>
#include <linux/pkt_cls.h>   // TC_ACT_OK, TC_ACT_SHOT
#include <linux/tcp.h>
#include <bpf/bpf_endian.h>  // bpf_htons
#include <bpf/bpf_helpers.h>

SEC("cls_act")
int drop_http_func(struct __sk_buff *skb) {
    void *data_end = (void *)(long)skb->data_end;
    void *data = (void *)(long)skb->data;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return TC_ACT_OK; // Too short for an Ethernet header, pass it on

    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return TC_ACT_OK; // Not an IPv4 packet, pass it on

    struct iphdr *iph = data + sizeof(struct ethhdr);
    if ((void *)(iph + 1) > data_end)
        return TC_ACT_OK; // Not a full IP header, pass it on

    if (iph->protocol != IPPROTO_TCP)
        return TC_ACT_OK; // Not TCP, pass it on

    // ihl counts 32-bit words; anything below 5 is malformed
    if (iph->ihl < 5)
        return TC_ACT_OK;

    struct tcphdr *tcph = (void *)iph + iph->ihl * 4;
    if ((void *)(tcph + 1) > data_end)
        return TC_ACT_OK; // Not a full TCP header, pass it on

    // Check for destination port 80 (HTTP)
    if (tcph->dest == bpf_htons(80))
        return TC_ACT_SHOT; // Drop the packet

    return TC_ACT_OK; // Allow other traffic
}

char _license[] SEC("license") = "GPL";
EOF

# Compile the eBPF program
clang -O2 -target bpf -c drop_http.c -o drop_http.o

# Load the eBPF program into the kernel using tc
# Replace eth0 with your actual network interface
# The clsact qdisc provides the ingress and egress hooks
# (this command fails with "File exists" if it is already present)
tc qdisc add dev eth0 clsact
# "da" (direct-action) makes tc treat the return value as a TC_ACT_* verdict
tc filter add dev eth0 ingress bpf da obj drop_http.o sec cls_act
Now, if you run curl http://your_server_ip from another machine (where your_server_ip is the address of the interface eth0), the packets are dropped before they reach the kernel’s TCP layer, let alone the user-space application. Because nothing is sent back, not even a RST, the client hangs and eventually times out rather than reporting "connection refused". Other traffic, like SSH on port 22, will pass through unaffected.
The key is the TC_ACT_SHOT return code. When an eBPF program attached in direct-action mode to the TC (Traffic Control) ingress hook returns TC_ACT_SHOT, the kernel discards the packet without passing it further up the stack. This is efficient because the drop happens before IP and TCP processing, although XDP, which runs in the driver before a socket buffer is allocated, can drop even earlier. The SEC("cls_act") macro tells the compiler to place this function in an ELF section named cls_act. The name itself has no special meaning here; it only has to match the sec argument given to tc, which is how the loader finds the program in the object file.
The core problem eBPF TC programs solve is the inflexibility of traditional network packet filtering mechanisms like iptables. iptables works by traversing a series of tables and chains and evaluating the rules in each chain one after another. While powerful, this can become a performance bottleneck with large rule sets and is challenging to manage for complex policies. eBPF allows you to write custom logic that runs directly in the kernel’s network path, offering fine-grained control and often higher performance. You can inspect virtually every field in a packet, make complex decisions based on packet content, and then decide whether to accept, drop, redirect, or even modify the packet.
The actual mechanism involves attaching an eBPF program to a network device’s ingress or egress hook. When a packet arrives (ingress) or is about to leave (egress), the kernel executes the attached eBPF program. The program receives a struct __sk_buff as its argument, which is a restricted view of the kernel’s socket buffer: the packet and its associated metadata. The eBPF program can then examine packet headers and data in two ways: through helper functions provided by the kernel (like bpf_skb_load_bytes), or through direct packet access between skb->data and skb->data_end, where the verifier requires a bounds check before every read. Based on these observations, the program returns an action code: TC_ACT_OK to pass the packet, TC_ACT_SHOT to drop it, TC_ACT_REDIRECT to send it to another device (normally returned via bpf_redirect), or TC_ACT_RECLASSIFY to restart classification.
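The parse-then-decide flow can be tried out in plain userspace C, without loading anything into the kernel. The sketch below is not kernel code: ACT_OK and ACT_SHOT are hypothetical stand-ins that mirror the values of TC_ACT_OK (0) and TC_ACT_SHOT (2), a length replaces the data/data_end pair, and classify_frame and build_tcp_frame are illustrative names. Every read is preceded by a length check, which is the same discipline the verifier enforces on direct packet access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins mirroring TC_ACT_OK (0) and TC_ACT_SHOT (2). */
enum { ACT_OK = 0, ACT_SHOT = 2 };

#define ETH_HDR_LEN     14      /* Ethernet header length in bytes */
#define IP_MIN_HDR_LEN  20      /* IPv4 header without options */
#define TCP_MIN_HDR_LEN 20      /* TCP header without options */
#define ETHERTYPE_IPV4  0x0800
#define PROTO_TCP       6

/* Read a big-endian (network order) 16-bit field. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Bounds-check, parse, and return a verdict for one frame.
 * len plays the role of (data_end - data) in the TC program. */
static int classify_frame(const uint8_t *data, size_t len, uint16_t blocked_port)
{
    if (len < ETH_HDR_LEN)
        return ACT_OK;                      /* no full Ethernet header */
    if (rd16(data + 12) != ETHERTYPE_IPV4)
        return ACT_OK;                      /* not IPv4 */

    size_t ip_off = ETH_HDR_LEN;
    if (len < ip_off + IP_MIN_HDR_LEN)
        return ACT_OK;                      /* no full IPv4 header */
    if (data[ip_off + 9] != PROTO_TCP)
        return ACT_OK;                      /* not TCP */

    size_t ihl = (size_t)(data[ip_off] & 0x0f) * 4;  /* header length in bytes */
    if (ihl < IP_MIN_HDR_LEN)
        return ACT_OK;                      /* malformed header length */

    size_t tcp_off = ip_off + ihl;
    if (len < tcp_off + TCP_MIN_HDR_LEN)
        return ACT_OK;                      /* no full TCP header */

    /* Destination port sits at bytes 2..3 of the TCP header. */
    return rd16(data + tcp_off + 2) == blocked_port ? ACT_SHOT : ACT_OK;
}

/* Fill buf (at least 54 bytes) with a minimal Ethernet + IPv4 + TCP frame
 * carrying the given destination port; returns the frame length. */
static size_t build_tcp_frame(uint8_t *buf, uint16_t dport)
{
    memset(buf, 0, 54);
    buf[12] = 0x08;                         /* EtherType 0x0800 (IPv4) */
    buf[13] = 0x00;
    buf[14] = 0x45;                         /* version 4, IHL 5 */
    buf[14 + 9] = PROTO_TCP;
    buf[34 + 2] = (uint8_t)(dport >> 8);    /* TCP destination port */
    buf[34 + 3] = (uint8_t)(dport & 0xff);
    return 54;
}
```

Calling build_tcp_frame(buf, 80) and then classify_frame(buf, 54, 80) yields ACT_SHOT, while the same frame built with port 22, or truncated to 40 bytes, yields ACT_OK.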
One common point of confusion is the bpf_redirect helper function. Its signature is bpf_redirect(ifindex, flags): ifindex is the index of the network interface you want to send the packet to, and flags is either 0 (send on that interface’s egress path) or BPF_F_INGRESS (inject the packet into its ingress path). In a TC program the helper returns TC_ACT_REDIRECT on success and TC_ACT_SHOT on error, so the usual idiom is return bpf_redirect(target_ifindex, 0). There is no BPF_REDIRECT_TO_MAP value, and bpf_redirect never takes a map as an argument; the map-based variant, bpf_redirect_map, is used with XDP programs. To load-balance from TC, look up the target interface index yourself in an ordinary BPF map (keyed by a flow hash, for example) and pass the result to bpf_redirect. This keeps the forwarding decision in the kernel, which for certain scenarios removes the need for a separate userspace load balancer.
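A minimal sketch of load balancing from TC with bpf_redirect, under several assumptions: the map name backends, the table size MAX_BACKENDS, the helper slot_for_hash, and the program name lb_redirect are all hypothetical, and userspace is assumed to have filled the array with interface indexes. The BTF-style .maps definition also assumes a libbpf-based loader, for example an iproute2 build with libbpf support. The kernel-only part is wrapped in #ifdef __bpf__, which clang defines when targeting BPF, so the slot selection can also be compiled and unit-tested as ordinary C.

```c
#define MAX_BACKENDS 4   /* hypothetical size of the backend table */

/* Pure helper: map a flow hash onto a slot of the backend table. */
static inline unsigned int slot_for_hash(unsigned int hash)
{
    return hash % MAX_BACKENDS;
}

#ifdef __bpf__   /* defined by clang when building with -target bpf */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

/* slot -> target ifindex, written from userspace; 0 means "unset". */
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, MAX_BACKENDS);
    __type(key, __u32);
    __type(value, __u32);
} backends SEC(".maps");

SEC("tc")
int lb_redirect(struct __sk_buff *skb)
{
    /* Use the skb flow hash so packets of one flow keep the same slot. */
    __u32 slot = slot_for_hash(bpf_get_hash_recalc(skb));
    __u32 *ifindex = bpf_map_lookup_elem(&backends, &slot);

    if (!ifindex || *ifindex == 0)
        return TC_ACT_OK;   /* no backend configured: leave the packet alone */

    /* flags = 0 sends on the target's egress path;
     * BPF_F_INGRESS would inject on its ingress path instead. */
    return bpf_redirect(*ifindex, 0);
}

char _license[] SEC("license") = "GPL";
#endif
```

Because bpf_redirect itself returns TC_ACT_REDIRECT or TC_ACT_SHOT, returning its result directly is enough; the program never needs to name TC_ACT_REDIRECT explicitly.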
With this in place, your next challenge might be dynamically updating the list of blocked ports without recompiling and reloading the eBPF program.