GitHub audit logs are a treasure trove of information about who did what, when, and where within your repositories. Normally, you’d comb through them manually or use GitHub’s built-in search, which is fine for spot-checking. But what if you want to detect threats as they happen, at the code level, using a powerful runtime security tool like Falco? That’s where streaming these logs comes in.
By default, GitHub audit logs aren’t designed for real-time security analysis. They’re primarily for compliance and historical review. To make them useful for something like Falco, you need to bridge the gap between GitHub’s event-driven architecture and Falco’s rule-based detection engine.
Let’s see this in action. Imagine a developer accidentally pushes a sensitive file containing API keys directly to a public repository. This is a critical security event.
Here’s a simplified view of what a GitHub audit log entry for that event might look like:
{
  "timestamp": "2023-10-27T10:30:00Z",
  "event": "create_ref",
  "actor": {
    "login": "developer_jane",
    "id": 12345,
    "avatar_url": "https://avatars.githubusercontent.com/u/12345?v=4"
  },
  "repo": {
    "name": "my-awesome-project",
    "id": 67890,
    "url": "https://api.github.com/repos/my-org/my-awesome-project"
  },
  "payload": {
    "ref": "refs/heads/main",
    "ref_type": "branch",
    "before": "a1b2c3d4e5f67890",
    "after": "f0e9d8c7b6a54321",
    "commits": [
      {
        "sha": "f0e9d8c7b6a54321",
        "author": {"name": "Jane Doe", "email": "jane.doe@example.com"},
        "message": "feat: Add new feature and accidentally commit API keys",
        "distinct": true,
        "url": "https://api.github.com/repos/my-org/my-awesome-project/commits/f0e9d8c7b6a54321"
      }
    ]
  },
  "org": {
    "login": "my-org",
    "id": 112233
  },
  "api_url": "https://api.github.com/organizations/112233/audit/log"
}
Now, how do we get this into Falco? We need a way to ingest these events from GitHub and transform them into a format Falco can understand. The most common approach involves using GitHub’s webhook functionality to trigger an intermediary service.
The Plumbing: GitHub Webhooks, an Ingestion Service, and Falco
- GitHub Webhooks: You configure a webhook in your GitHub repository (or organization) to send push events to a specific URL. This URL will point to your ingestion service.
- Ingestion Service: This is a small application (e.g., a Python Flask app, a Node.js service, or even a serverless function like AWS Lambda or Google Cloud Functions) that:
  - Receives the webhook payload from GitHub.
  - Parses the JSON payload.
  - Transforms the relevant GitHub event data into a structured format that can be consumed by Falco. This often means creating custom log entries or enriching existing ones.
  - Sends these transformed events to a centralized logging system or directly to Falco’s input. For simplicity, many people stream these to stdout and let a container orchestrator or Falco itself pick them up.
- Falco: Falco runs on your infrastructure (e.g., Kubernetes nodes, VMs) and monitors system calls and other event sources. You configure Falco to read from your ingestion service’s output (e.g., a file, a network socket, or stdout from a container) and apply custom rules to detect malicious patterns within the GitHub audit log data.
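The ingestion service steps can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not a production service: the output schema (the event and github keys), the transform_push and serve names, and port 8080 are assumptions made for this example. The pusher, repository, and commits keys and the X-GitHub-Event header come from GitHub’s push webhook delivery.

```python
import json
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer


def transform_push(payload: dict) -> list[dict]:
    """Flatten a GitHub push webhook payload into one event per commit."""
    events = []
    for commit in payload.get("commits", []):
        events.append({
            # The "event" and "github" keys are this sketch's own schema,
            # not a format that Falco mandates.
            "event": {"type": "github_audit_log", "source": "github", "subsystem": "push"},
            "github": {
                "actor": {"login": payload.get("pusher", {}).get("name", "")},
                "repo": {"name": payload.get("repository", {}).get("name", "")},
                "commit": {"sha": commit.get("id", ""), "message": commit.get("message", "")},
            },
        })
    return events


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        # GitHub names the delivered event type in this header.
        if self.headers.get("X-GitHub-Event") == "push":
            for event in transform_push(payload):
                # One JSON object per line on stdout, for a log collector to pick up.
                sys.stdout.write(json.dumps(event) + "\n")
            sys.stdout.flush()
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the webhook listener (blocks until interrupted)."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling serve() starts the listener. Keeping the transformation in a plain function separate from the HTTP handler makes it easy to unit-test against captured payloads before pointing a real webhook at it.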
Building the Mental Model: How Falco Sees GitHub Events
Falco’s power lies in its rule engine. You write rules that define what constitutes a suspicious event. For GitHub audit logs, you’re essentially creating rules that look for specific patterns within the transformed log data.
Consider the accidental push of sensitive keys. Your ingestion service would capture the push event and extract the commit message. It could also extract the list of changed files, which the push webhook payload carries but audit log entries typically do not. You’d then send the result to Falco.
A Falco rule might look something like this:
- rule: Detect Sensitive Data Exposure in Commit Message
  desc: Alerts when a commit message explicitly mentions sensitive data like API keys, passwords, or tokens.
  # The keyword checks are grouped in parentheses so the evt.* filters apply to all three.
  # icontains compares case-insensitively, so "API keys" matches "api key".
  condition: >
    evt.type = "github_audit_log" and
    evt.source = "github" and
    evt.subsystem = "push" and
    (github.commit.message icontains "api key" or
    github.commit.message icontains "password" or
    github.commit.message icontains "token")
  output: |
    Sensitive data mentioned in commit message by user %github.actor.login in repo %github.repo.name.
    Commit SHA: %github.commit.sha
    Commit Message: %github.commit.message
  priority: CRITICAL
  tags: [github, security, sensitive-data]
When the ingestion service sends the push event to Falco, the rule fires. The resulting alert, shown here together with the event and github fields that the condition matched on, looks something like this:
{
  "ts": 1698402600000,
  "rule": "Detect Sensitive Data Exposure in Commit Message",
  "output": "Sensitive data mentioned in commit message by user developer_jane in repo my-awesome-project. Commit SHA: f0e9d8c7b6a54321 Commit Message: feat: Add new feature and accidentally commit API keys",
  "priority": "CRITICAL",
  "level": "critical",
  "tags": ["github", "security", "sensitive-data"],
  "event": {
    "type": "github_audit_log",
    "source": "github",
    "subsystem": "push"
  },
  "github": {
    "actor": {"login": "developer_jane"},
    "repo": {"name": "my-awesome-project"},
    "commit": {
      "sha": "f0e9d8c7b6a54321",
      "message": "feat: Add new feature and accidentally commit API keys"
    }
  }
}
The event type, source, subsystem, and commit message all satisfy the condition, so Falco triggers the alert.
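Before wiring events into Falco, it can help to check locally which transformed events you expect the rule to catch. The sketch below mirrors the intent of the rule’s condition in plain Python: all three event filters must hold, and the commit message must mention at least one keyword, compared case-insensitively. It is not Falco’s rule engine, and the matches_rule name is hypothetical.

```python
KEYWORDS = ("api key", "password", "token")


def matches_rule(event: dict) -> bool:
    """Approximate the rule's condition for a transformed event (illustration only)."""
    meta = event.get("event", {})
    message = event.get("github", {}).get("commit", {}).get("message", "").lower()
    return (
        meta.get("type") == "github_audit_log"
        and meta.get("source") == "github"
        and meta.get("subsystem") == "push"
        # The keyword checks form one group; every filter above must also hold.
        and any(keyword in message for keyword in KEYWORDS)
    )
```

Running a handful of captured events through a check like this is a cheap way to spot events that should alert but would not, for example because of letter case or a missing field.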
Leveraging Falco’s Event Fields for Granular Detection
The key to effective detection here is understanding the structure of the events you’re sending into Falco and how to map them to Falco’s event fields. Your ingestion service acts as a translator, taking the raw GitHub audit log JSON and enriching it or restructuring it into a format that aligns with Falco’s expectations.
For instance, you might want to detect unauthorized access attempts. GitHub audit logs contain events like user_two_factor_authentication_enable or user_login. Your ingestion service could parse these, extract the actor.login, target.login (if applicable), and the action performed. Then, you’d create Falco rules that look for specific sequences of actions by specific users, or actions performed outside of expected hours, using fields like github.actor.login, github.action, and github.timestamp.
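One way to support an "outside expected hours" rule is to have the ingestion service compute a boolean from the event timestamp and attach it to the event, so the Falco rule only has to test a single field. The sketch below assumes ISO 8601 timestamps like the one in the sample entry, evaluates hours in UTC, and uses an 08:00 to 18:00 window; the function name and the window are illustrative choices.

```python
from datetime import datetime, timezone


def outside_expected_hours(timestamp: str, start_hour: int = 8, end_hour: int = 18) -> bool:
    """Return True when the event falls outside [start_hour, end_hour) in UTC."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so normalize it to an explicit UTC offset first.
    moment = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    hour = moment.astimezone(timezone.utc).hour
    return not (start_hour <= hour < end_hour)
```

A real deployment would likely need per-user time zones and weekend handling, which this sketch leaves out.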
A more advanced scenario involves detecting patterns of privilege escalation. Imagine a user who is a regular developer suddenly being granted admin access to a critical repository. Your ingestion service would need to capture org.add_member or repo.update_collaborator events, identify the role change (e.g., from member to admin), and send this information to Falco. A Falco rule could then be crafted to flag any promotion to admin status for accounts that haven’t been explicitly approved through a separate process.
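The role-change check could be sketched as follows. Everything about the event shape here is an assumption for illustration: the target, old_role, and new_role keys stand in for whatever your ingestion service extracts, and APPROVED_ADMINS stands in for the output of your separate approval process.

```python
# Hypothetical allowlist; in practice this would come from your approval workflow.
APPROVED_ADMINS = {"release-bot", "alice"}


def is_unapproved_promotion(event: dict, approved: set[str] = APPROVED_ADMINS) -> bool:
    """Flag a change to the admin role for an account that is not pre-approved."""
    return (
        event.get("new_role") == "admin"
        and event.get("old_role") != "admin"
        and event.get("target") not in approved
    )
```

The ingestion service could attach the result to the event it forwards, leaving the Falco rule to match on that single flag.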
The real magic happens when you combine these GitHub events with other system events Falco monitors. For example, if a user who is flagged for suspicious GitHub activity also initiates a suspicious process on a server, Falco can correlate these events and provide a more comprehensive security picture. This means your ingestion service should ideally be able to send enriched events that include context about the user and their actions, which Falco can then cross-reference with its other data sources.
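To make the correlation idea concrete, here is a small sketch of the bookkeeping involved: remember which users were recently flagged for GitHub activity, then check host events against that set. It assumes that GitHub logins can be mapped to the user names seen on hosts, that timestamps arrive in non-decreasing order, and that a one-hour window is appropriate; the FlaggedUsers class is hypothetical and sits outside Falco itself.

```python
from collections import deque


class FlaggedUsers:
    """Track users flagged within a sliding time window (timestamps in seconds)."""

    def __init__(self, window_seconds: float = 3600.0):
        self.window = window_seconds
        self._flags = deque()  # (timestamp, user) pairs in arrival order

    def flag(self, user: str, ts: float) -> None:
        self._flags.append((ts, user))

    def is_flagged(self, user: str, ts: float) -> bool:
        # Drop flags that have aged out of the window.
        while self._flags and self._flags[0][0] < ts - self.window:
            self._flags.popleft()
        return any(flagged == user for _, flagged in self._flags)
```

For example, after tracker.flag("developer_jane", ts) is recorded for a suspicious GitHub event, a later host event for the same user can be checked with tracker.is_flagged("developer_jane", host_ts) and escalated if it returns True.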
The one thing most people overlook is the sheer volume and variety of events available in GitHub audit logs. Beyond just pushes and logins, you have events related to repository creation, deletion, secret scanning alerts, dependency review changes, and much more. By carefully selecting which events to stream and how to structure them for Falco, you can build incredibly granular detection capabilities that go far beyond simple file integrity monitoring. This allows you to proactively identify and respond to threats that originate from your development workflow itself.
Once you’ve successfully integrated GitHub audit logs into Falco, the next logical step is to explore how to integrate other cloud provider audit logs (like AWS CloudTrail or Azure Activity Logs) for a unified, cross-cloud security posture.