Falco’s performance overhead isn’t a bug; it’s a fundamental trade-off between comprehensive security visibility and system resource utilization.

Let’s see Falco in action. Imagine a simple web server running in Kubernetes. We’ll use kubectl to deploy a basic Nginx pod and then run Falco to monitor it.

# nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80

Apply this with kubectl apply -f nginx-pod.yaml.

Now, let’s assume Falco is already running in the cluster (e.g., as a DaemonSet). We can tail its output to see events:

kubectl logs -n falco falco-xxxx -f

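When we access the Nginx server (for example, by running kubectl port-forward pod/nginx-test 8080:80 and then curl localhost:8080), Falco generates an event whenever a loaded rule matches. With JSON output enabled and a custom rule that logs file reads, you might see something like: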

{
  "hostname": "k8s-node-1",
  "output": "10:03:15.123456789: Notice File opened by container (container_id=abcdef123456 container_name=nginx-test image=nginx proc=nginx file=/usr/share/nginx/html/index.html)",
  "priority": "Notice",
  "rule": "File access by container",
  "source": "syscall",
  "time": "2023-10-27T10:03:15.123456789Z",
  "output_fields": {
    "container.id": "abcdef123456",
    "container.name": "nginx-test",
    "container.image.repository": "nginx",
    "fd.name": "/usr/share/nginx/html/index.html",
    "proc.name": "nginx"
  }
}

This event shows that the nginx process in the nginx-test container opened /usr/share/nginx/html/index.html, the default page served by the official nginx image. The rule name is illustrative; it is not one of Falco’s default rules. Falco hooks into kernel system calls (syscalls) through a kernel module or an eBPF probe to observe these actions. The rules engine then evaluates the events against predefined or custom rules to detect suspicious activity.
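Because each alert is a single JSON object per line, a few lines of Python are enough to see which rules fire most often. The sketch below is only an illustration: it relies on nothing but the "rule" key, and the sample log lines are made up.

```python
import json
from collections import Counter


def count_alerts_by_rule(lines):
    """Count Falco JSON alerts per rule name, ignoring non-JSON log lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # plain-text log lines, e.g. startup messages
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed line
        counts[event.get("rule", "<unknown>")] += 1
    return counts


# Hypothetical log lines; real alerts carry more keys than shown here.
sample_lines = [
    "Falco initialized",
    '{"rule": "File access by container", "priority": "Notice"}',
    '{"rule": "File access by container", "priority": "Notice"}',
    '{"rule": "Terminal shell in container", "priority": "Notice"}',
]

for rule, n in count_alerts_by_rule(sample_lines).most_common():
    print(f"{n:6d}  {rule}")
```

To use it on real output, pass any iterable of lines, such as sys.stdin fed from kubectl logs. The rules at the top of the list are usually the first candidates for tuning.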

The problem Falco solves is providing fine-grained, runtime security visibility into containerized environments without relying solely on network logs or application-level instrumentation, which often miss low-level, malicious activities. It can detect things like:

  • A container trying to execute a shell (/bin/sh, /bin/bash).
  • A process attempting to read sensitive files like /etc/passwd or /etc/shadow.
  • A container making unexpected network connections.
  • A process modifying critical system binaries.

Internally, Falco works by:

  1. Capturing System Events: It uses eBPF (preferred) or a kernel module to tap into the kernel’s syscall stream. This is the most resource-intensive part.
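  2. Filtering Events: The driver records only the syscall types it has been told are interesting, so in recent Falco versions syscalls that no loaded rule needs can be skipped in the kernel. Rule conditions themselves are not evaluated in the kernel; that happens in userspace.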
  3. Rule Matching (Userspace): The Falco daemon receives filtered events and matches them against its rules engine.
  4. Alerting: When a rule matches, Falco emits an alert to stdout, a file, syslog, a program, or an HTTP or gRPC endpoint. Forwarders such as Falcosidekick can relay alerts to systems like Kafka or Slack.
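The four stages above can be sketched as a toy model in Python. This is not Falco’s implementation: the Event and Rule classes are hypothetical, the kernel stage is modelled as a plain syscall-type filter, and rule conditions are ordinary Python functions. It only shows where work is dropped early and where it is paid in full.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    syscall: str
    proc_name: str
    fd_name: str = ""


@dataclass
class Rule:
    name: str
    syscalls: frozenset  # syscall types this rule can ever match
    condition: Callable[[Event], bool]


def run_pipeline(events, rules):
    """Capture -> cheap syscall-type filter -> rule matching -> alerts."""
    wanted = frozenset().union(*(r.syscalls for r in rules))
    stats = {"captured": 0, "evaluated": 0}
    alerts = []
    for ev in events:
        stats["captured"] += 1
        if ev.syscall not in wanted:
            continue  # dropped early: no rule cares about this syscall type
        stats["evaluated"] += 1
        for rule in rules:
            if ev.syscall in rule.syscalls and rule.condition(ev):
                alerts.append(f"{rule.name}: {ev.proc_name} {ev.fd_name}".strip())
    return alerts, stats


rules = [
    Rule("Shell spawned", frozenset({"execve"}),
         lambda e: e.proc_name in ("sh", "bash")),
    Rule("Sensitive file read", frozenset({"open", "openat"}),
         lambda e: e.fd_name == "/etc/shadow"),
]
events = [
    Event("read", "nginx"),
    Event("openat", "nginx", "/usr/share/nginx/html/index.html"),
    Event("execve", "bash"),
    Event("openat", "cat", "/etc/shadow"),
]

alerts, stats = run_pipeline(events, rules)
print(alerts)
print(stats)
```

In this model, every captured event costs something, every evaluated event costs more, and only matching events produce output. That is why both the syscall mix of the workload and the number of loaded rules matter.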

The performance overhead comes primarily from the kernel instrumentation and the volume of events generated. Every syscall, especially those involving file I/O, network activity, or process execution, can potentially be captured. The more rules you have enabled, and the more "noisy" your applications are (e.g., heavy I/O, frequent process forking), the higher the overhead.
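A back-of-the-envelope estimate makes the relationship concrete. The sketch below multiplies an event rate by an assumed average processing cost per event. Both numbers in the example are placeholders, not measurements of Falco.

```python
def estimated_cpu_cores(events_per_sec: float, ns_per_event: float) -> float:
    """Rough CPU cost, in cores, of handling events at a given rate.

    ns_per_event is an assumed average cost per event (capture plus rule
    evaluation); measure it on your own system rather than trusting a guess.
    """
    return events_per_sec * ns_per_event / 1e9


# Placeholder inputs: 200,000 events/s at 500 ns each.
quiet = estimated_cpu_cores(200_000, 500)
# The same per-event cost with a workload ten times as noisy.
noisy = estimated_cpu_cores(2_000_000, 500)
print(f"quiet node: {quiet:.2f} cores, noisy node: {noisy:.2f} cores")
```

The cost scales linearly with both factors, so halving the event rate (fewer captured syscalls) and halving the per-event cost (fewer or cheaper rules) help equally.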

Measuring Overhead:

The first step is always to establish a baseline. Run your application(s) under normal load without Falco, and then run them with Falco enabled but with a minimal set of rules. Use standard Linux performance tools:

  • top / htop: Look for the Falco process (usually named falco) and monitor its CPU and memory usage. A sustained high CPU percentage (e.g., >15-20%) for the Falco process itself, even with minimal rules, indicates a potential issue.
  • perf: This is your most powerful tool. To see which syscalls the Falco userspace process itself makes, run:
    perf record -e 'syscalls:sys_enter_*' -p $(pidof falco) -- sleep 10
    perf report
    
    This records every syscall entry made by the Falco process for 10 seconds. Look for disproportionately high counts of specific syscalls. It does not measure the kernel-side cost of Falco’s driver, which runs in the context of whichever process makes a syscall. To see that cost, sample the whole system and look for the driver’s symbols. This is more advanced but crucial for deep dives:
    # Sample all CPUs with call graphs for 10 seconds
    perf record -a -g -- sleep 10
    perf report --sort comm,dso,symbol
    
    In the report, look for time spent in bpf_prog_* symbols (eBPF probe) or in the falco kernel module. Exact symbol names vary by driver and kernel version.
  • strace (use with extreme caution on production): While not for measuring overhead directly, strace -p <falco_pid> can show you what syscalls Falco is making. This helps understand why it’s consuming CPU.
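  • Falco’s own metrics: Recent Falco releases can periodically emit internal metrics, either as a regular alert or to a file. The keys below follow the falco.yaml shipped with recent versions; they have changed over time, so check the file for your release:
    # In falco.yaml
    log_level: info
    json_output: true
    stdout_output:
      enabled: true
    metrics:
      enabled: true
      interval: 15m
      output_rule: true
      resource_utilization_enabled: true
      kernel_event_counters_enabled: true
    
    Then, kubectl logs -n falco falco-xxxx will show periodic metrics snapshot events in JSON, including event and drop counters and Falco’s own CPU and memory usage.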
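Whichever tool you use, the comparison against the baseline comes down to the same arithmetic. The sketch below computes the relative change in a "lower is better" metric such as request latency, using medians so that a single outlier does not dominate. The sample values are invented for illustration.

```python
from statistics import median


def overhead_percent(baseline, with_falco):
    """Relative change of the median, in percent, versus the baseline.

    Both arguments are sequences of samples of the same metric, such as
    request latency in milliseconds, taken under the same load.
    """
    base = median(baseline)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    return (median(with_falco) - base) / base * 100.0


# Invented latency samples in milliseconds.
baseline_ms = [10.0, 10.2, 9.8, 10.1, 9.9]
with_falco_ms = [10.4, 10.6, 10.5, 10.3, 10.7]

print(f"latency overhead: {overhead_percent(baseline_ms, with_falco_ms):.1f}%")
```

Take the samples under the same load and for the same duration in both runs. Otherwise the difference reflects the workload rather than Falco.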

Reducing Overhead:

  1. Rule Optimization (Most Impactful):

    • Disable Unused Rules: Review the rules files Falco loads (the defaults plus anything in your rules.d/ directory) for rules you don’t need. Rules are not organized into groups, but each rule can carry tags, and any rule can be switched off by name with an override in a rules file that is loaded after the file defining it. For example, if you never want shell or sensitive-file alerts:
      # In a rules file loaded after the defaults, e.g. under rules.d/
      - rule: Terminal shell in container
        enabled: false
      - rule: Read sensitive file untrusted
        enabled: false
      
      Rules files are applied in load order, so the later override wins. Newer Falco versions also accept an explicit override key and a rule selection list in falco.yaml; check the documentation for your version.
    • Narrow Noisy Rules: Falco rules have no skip_events field or any other per-rule counter or time window. Tame a noisy rule by tightening its condition, adding exceptions for known-good activity, or deduplicating alerts downstream. Note that output fields are written as %field.name, without parentheses.
      # Example rule in a .yaml file within rules.d/
      - rule: Unusual network connection
        desc: Detect a connection to an unusual privileged port
        condition: evt.type = connect and fd.sport < 1024 and not fd.sport in (80, 443)
        output: Unusual network connection attempt (user=%user.name user_uid=%user.uid comm=%proc.name comm_pid=%proc.pid comm_ppid=%proc.ppid container_id=%container.id container_name=%container.name port=%fd.sport)
        priority: WARNING
      
      
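    • Be Specific in Conditions: Instead of broad openat or execve rules, narrow them down. For example, instead of evt.type = execve on its own, use evt.type = execve and proc.name = bash. proc.name holds the process name without its path; to match /bin/bash, use proc.exepath. Narrower conditions cut rule-evaluation work and alert volume, although the syscall itself is still captured.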
    • Avoid Excessive Wildcards: Broad pattern operators such as contains or glob on file paths or process names match more events and cost more to evaluate than exact comparisons or in lists.
  2. eBPF Program Tuning:

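    • Kernel Module vs. eBPF: Falco ships three drivers: the kernel module, the legacy eBPF probe, and the modern eBPF probe (CO-RE, which needs a reasonably recent kernel). The kernel module has traditionally had the lowest overhead, with the modern eBPF probe close behind and the legacy eBPF probe somewhat slower. eBPF is less intrusive, because a verified eBPF program cannot crash the kernel the way a buggy module can. Which driver is used depends on your Falco version and install method (engine.kind in falco.yaml, driver.kind in the Helm chart), so check rather than assume.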
    • eBPF Helper Functions: Falco’s eBPF program uses helpers to filter events early. Ensure your kernel is recent enough to support optimized eBPF features.
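    • base_syscalls: The set of syscalls the driver captures is derived from the loaded rules plus the syscalls Falco needs to track process and file-descriptor state. In recent versions, the base_syscalls setting in falco.yaml lets you shrink that set further, at the cost of less accurate state if you remove too much. Treat this as an advanced option and read the comments in your version’s falco.yaml before changing it.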
  3. Resource Allocation:

    • CPU Limits/Requests: For Falco running as a Kubernetes DaemonSet, set appropriate CPU requests and limits. Start with cpu: "500m" and monitor. You might need 1 or 2 CPU cores for very busy clusters.
    • Memory: Falco’s memory usage is usually modest, but ensure it has enough. memory: "256Mi" is a common starting point.
  4. Output Configuration:

    • Reduce Output Verbosity: If you’re logging to stdout or a file, ensure the log level is appropriate (info is good, debug is very noisy).
    • Use Efficient Outputs: For high-volume environments, forward alerts to a dedicated pipeline (for example, over HTTP to Falcosidekick, which can relay them to Kafka or a logging system) rather than writing directly to a file or stdout on every node.
  5. System-Level Tuning:

    • Kernel Parameters: In rare cases, kernel parameters related to networking or file system performance might indirectly affect syscall rates, but this is usually a last resort.
    • inotify Limits: Falco’s detection is based on syscalls, not inotify, so fs.inotify.max_user_watches does not affect its overhead. Falco uses inotify only to watch its own configuration and rules files for reloads, which needs just a handful of watches.
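For the rule-disabling step in item 1, one long-standing approach is a small override file, loaded after the default rules, that sets enabled: false for each rule by name. If the list of rules to switch off comes from a review or a script, the file can be generated. The sketch below is a minimal illustration: the function name is hypothetical, and the rule names must match the rules your Falco instance actually loads.

```python
import json


def disable_rules_yaml(rule_names):
    """Render a rules file that sets enabled: false for each named rule."""
    blocks = []
    for name in rule_names:
        # A JSON string literal is also a valid YAML double-quoted scalar,
        # so json.dumps gives safe quoting without a YAML library.
        blocks.append(f"- rule: {json.dumps(name)}\n  enabled: false\n")
    return "\n".join(blocks)


# Example rule names; replace them with names from your own rules files.
to_disable = ["Terminal shell in container", "Read sensitive file untrusted"]
print(disable_rules_yaml(to_disable))
```

Write the result to a file under rules.d/ (or wherever your deployment loads extra rules from) and reload Falco. Keeping the list in version control makes it easy to re-enable rules one at a time while re-measuring overhead.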

The most common mistake people make is enabling the default "all rules" set in production without understanding what each rule does and its potential impact. Start with a very minimal set of critical rules, measure the overhead, and then incrementally add more, re-measuring after each significant change.

The next challenge you’ll likely face is dealing with the sheer volume of legitimate alerts generated by your optimized ruleset, leading to alert fatigue.
