Span compression is a technique to reduce the amount of data generated by your Application Performance Monitoring (APM) system, primarily by eliminating redundant information within traces.

Let’s see it in action. Imagine a typical web request trace. It might involve a web server receiving a request, calling a database, and then calling a microservice. Without compression, each of these hops would generate a distinct "span" – a record of a specific operation within a trace.

Here’s a simplified view of what that might look like in a trace:

[
  {
    "traceId": "abc123xyz",
    "spanId": "span1",
    "parentSpanId": null,
    "name": "GET /users",
    "kind": "SERVER",
    "startTimeUnixNano": 1678886400000000000,
    "endTimeUnixNano": 1678886401000000000,
    "attributes": {
      "http.method": "GET",
      "http.route": "/users"
    }
  },
  {
    "traceId": "abc123xyz",
    "spanId": "span2",
    "parentSpanId": "span1",
    "name": "SQL SELECT",
    "kind": "CLIENT",
    "startTimeUnixNano": 1678886400100000000,
    "endTimeUnixNano": 1678886400500000000,
    "attributes": {
      "db.system": "postgresql",
      "db.statement": "SELECT * FROM users WHERE id = 1"
    }
  },
  {
    "traceId": "abc123xyz",
    "spanId": "span3",
    "parentSpanId": "span1",
    "name": "HTTP POST /auth",
    "kind": "CLIENT",
    "startTimeUnixNano": 1678886400600000000,
    "endTimeUnixNano": 1678886400900000000,
    "attributes": {
      "http.method": "POST",
      "http.url": "http://auth-service:8080/auth"
    }
  }
]

In this scenario, the GET /users span is the parent, and the SQL SELECT and HTTP POST /auth are its children. Each of these is a distinct operation. Span compression comes into play when you have a chain of operations where each subsequent operation is a direct child of the previous one, and there’s no significant work happening between them.
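To make the parent/child relationship concrete, here is a minimal sketch that rebuilds the tree from the example trace and prints each span with its duration. The helper names (`duration_ms`, `children_by_parent`, `render`) are illustrative assumptions, not part of any APM agent's API, and the spans are trimmed to the fields the sketch needs.

```python
from collections import defaultdict

# Trimmed copy of the three spans from the example trace.
spans = [
    {"spanId": "span1", "parentSpanId": None, "name": "GET /users",
     "startTimeUnixNano": 1678886400000000000, "endTimeUnixNano": 1678886401000000000},
    {"spanId": "span2", "parentSpanId": "span1", "name": "SQL SELECT",
     "startTimeUnixNano": 1678886400100000000, "endTimeUnixNano": 1678886400500000000},
    {"spanId": "span3", "parentSpanId": "span1", "name": "HTTP POST /auth",
     "startTimeUnixNano": 1678886400600000000, "endTimeUnixNano": 1678886400900000000},
]


def duration_ms(span):
    """Span duration in whole milliseconds."""
    return (span["endTimeUnixNano"] - span["startTimeUnixNano"]) // 1_000_000


def children_by_parent(spans):
    """Index spans by the ID of their parent (None for the root)."""
    index = defaultdict(list)
    for span in spans:
        index[span["parentSpanId"]].append(span)
    return index


def render(spans):
    """Return one indented line per span, ordered by start time."""
    index = children_by_parent(spans)
    lines = []

    def walk(parent_id, depth):
        for span in sorted(index[parent_id], key=lambda s: s["startTimeUnixNano"]):
            lines.append(f"{'  ' * depth}{span['name']} ({duration_ms(span)} ms)")
            walk(span["spanId"], depth + 1)

    walk(None, 0)
    return lines


print("\n".join(render(spans)))
```

Both children hang directly off the root here, so there is no pass-through chain to compress in this particular trace.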

Consider a trace where a service calls another service, which then immediately calls a database. Without compression, you’d have three spans: Service A -> Service B -> Database. The span for Service B’s operation would essentially just be the act of forwarding the request to the database. Span compression allows the APM agent to recognize this pattern and merge the span for Service B into the span for the database call, or even into the parent span if the overhead is minimal.
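The Service A -> Service B -> Database case can be sketched as follows. This is only an illustration under stated assumptions: the span shape, the `via` field, the `collapse_passthrough` function, and the one-millisecond self-time threshold are all hypothetical, and real agents apply their own vendor-specific rules.

```python
def collapse_passthrough(spans, max_self_time_us=1_000):
    """Drop spans that do little more than forward to a single child.

    Hypothetical heuristic: a non-root span with exactly one child is folded
    away when its own time (duration minus the child's duration) is at most
    max_self_time_us microseconds.
    """
    by_id = {s["id"]: dict(s) for s in spans}  # copy so the input is untouched
    children = {}
    for s in by_id.values():
        children.setdefault(s["parent"], []).append(s["id"])

    for span_id in list(by_id):
        span = by_id[span_id]
        kids = children.get(span_id, [])
        # Keep the root, leaf spans, and spans that fan out to several children.
        if span["parent"] is None or len(kids) != 1:
            continue
        child = by_id[kids[0]]
        self_time = (span["end_us"] - span["start_us"]) - (child["end_us"] - child["start_us"])
        if self_time > max_self_time_us:
            continue
        # Re-parent the child and record which hop was folded away.
        child["parent"] = span["parent"]
        child["via"] = span.get("via", []) + [span["name"]] + child.get("via", [])
        siblings = children[span["parent"]]
        siblings[siblings.index(span_id)] = child["id"]
        children.pop(span_id)
        del by_id[span_id]

    return list(by_id.values())


trace = [
    {"id": "a", "parent": None, "name": "Service A: GET /report", "start_us": 0, "end_us": 100_000},
    {"id": "b", "parent": "a", "name": "Service B: forward query", "start_us": 10_000, "end_us": 90_500},
    {"id": "db", "parent": "b", "name": "SELECT report", "start_us": 10_200, "end_us": 90_200},
]

for span in collapse_passthrough(trace):
    print(span["name"], "<-", span["parent"], span.get("via", []))
```

Recording the folded hop in `via` keeps a hint that Service B was involved even though its span is gone.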

The core problem span compression solves is the rapid growth of trace data. As your system scales and becomes more distributed, the number of individual operations (spans) within a single trace can grow very quickly. This leads to higher storage costs, higher ingestion load on your APM backend, and slower queries when you’re trying to find specific traces. By intelligently reducing the number of spans, you keep the essential information without the overwhelming volume.

Internally, APM agents or collectors implement span compression by analyzing the relationship between spans. When a span is created, the agent looks at its parent. If the parent span is still active and the new span represents a direct, low-overhead call (like a simple network hop or a brief function execution that primarily acts as a proxy), the agent can decide to "compress" the child span. This usually means either:

  1. Merging the child span’s duration and attributes into the parent span. The parent span’s end time is extended, and its attributes might be augmented with information from the child.
  2. Discarding the child span entirely and enriching the parent span with the child’s key information (like the endpoint called or the statement executed).
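The two strategies in the list above can be contrasted in a short sketch. The function names, the dictionary span shape, the `child.` attribute prefix, and the default `keep` list are assumptions made for illustration; they are not taken from any specific agent.

```python
def merge_into_parent(parent, child):
    """Strategy 1: fold the child's duration and all of its attributes into the parent."""
    merged = dict(parent)
    # Extend the parent's end time if the child finished later.
    merged["end_ms"] = max(parent["end_ms"], child["end_ms"])
    # Prefix child attributes so they cannot overwrite the parent's own keys.
    child_attrs = {f"child.{key}": value for key, value in child["attributes"].items()}
    merged["attributes"] = {**parent["attributes"], **child_attrs}
    return merged


def discard_and_enrich(parent, child, keep=("db.statement", "http.url")):
    """Strategy 2: drop the child and copy only its key attributes onto the parent."""
    enriched = dict(parent)
    kept = {key: value for key, value in child["attributes"].items() if key in keep}
    enriched["attributes"] = {**parent["attributes"], **kept}
    return enriched


parent = {"name": "GET /users", "end_ms": 50, "attributes": {"http.method": "GET"}}
child = {
    "name": "SQL SELECT",
    "end_ms": 60,
    "attributes": {"db.system": "postgresql", "db.statement": "SELECT * FROM users WHERE id = 1"},
}

print(merge_into_parent(parent, child))
print(discard_and_enrich(parent, child))
```

The first strategy changes the parent's timing; the second leaves timing alone and keeps only the details you chose to preserve.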

The exact logic depends on the APM vendor and the specific configuration. You can often tune parameters like the maximum duration of a span that’s eligible for compression, or the types of operations that are considered "compressible."

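The levers you control are typically configuration settings within your APM agent or collector, exposed as environment variables or configuration-file options. Setting names and availability differ by vendor and agent version, so check your agent’s configuration reference rather than assuming a given option exists. Elastic APM agents, for example, compress runs of consecutive, similar exit spans (such as repeated database queries) into a single composite span and expose settings like:

  • span_compression_enabled (environment variable ELASTIC_APM_SPAN_COMPRESSION_ENABLED), which turns compression on or off
  • span_compression_exact_match_max_duration, the maximum duration of consecutive, identical spans that may be compressed together
  • span_compression_same_kind_max_duration, the same limit for consecutive spans that go to the same destination but are not exact matches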

You can also often configure which operations are excluded from compression, ensuring that critical, potentially long-running, or complex intermediate steps are always captured as distinct spans.
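The tuning knobs described here (an on/off switch, a maximum duration, and an exclusion list) could be modelled roughly as below. `CompressionConfig`, `is_compressible`, the field names, and the default values are hypothetical and chosen only for the sketch; they do not correspond to any vendor's actual settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionConfig:
    """Hypothetical tuning knobs for a span-compression pass."""
    enabled: bool = True
    max_duration_ms: float = 10.0
    excluded_types: frozenset = frozenset({"messaging"})


def is_compressible(span, config):
    """Return True if the span may be compressed under the given config."""
    if not config.enabled:
        return False
    # Excluded operation types are always reported as distinct spans.
    if span["type"] in config.excluded_types:
        return False
    # Only short spans are candidates; long ones stay visible.
    return span["duration_ms"] <= config.max_duration_ms


config = CompressionConfig()
print(is_compressible({"type": "db", "duration_ms": 3.0}, config))
print(is_compressible({"type": "db", "duration_ms": 250.0}, config))
print(is_compressible({"type": "messaging", "duration_ms": 1.0}, config))
```

Checking the exclusion list before the duration means an excluded operation is never compressed, however short it is.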

Perhaps counterintuitively, span compression does not have to lose critical information; it can improve the signal-to-noise ratio by keeping the focus on the operations that actually matter. For instance, if a request to service B is immediately followed by a database query, the time spent in service B itself might be negligible, and capturing that tiny span only adds noise. Compressing it while keeping the database query span clearly linked to the original request gives a clearer picture of where the bottleneck is.

The next concept you’ll likely encounter is sampling. While span compression reduces the volume of data for every trace, sampling reduces the number of traces that are collected and sent to your APM backend in the first place, especially for high-volume applications.
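To show how sampling differs from compression, here is a minimal sketch of deterministic, trace-level sampling: the keep-or-drop decision is made once per trace ID, so every span of a trace shares the same fate. The `keep_trace` function and the hashing scheme are assumptions for illustration, not how any particular APM product implements sampling.

```python
import hashlib


def keep_trace(trace_id, sample_rate):
    """Decide once per trace whether to keep it, based on a hash of its ID.

    sample_rate is the fraction of traces to keep, between 0.0 and 1.0.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < int(sample_rate * 2**64)


trace_ids = [f"trace-{n}" for n in range(1000)]
kept = [t for t in trace_ids if keep_trace(t, 0.1)]
print(f"kept {len(kept)} of {len(trace_ids)} traces")
```

Because the decision depends only on the trace ID, any service that sees the same ID reaches the same verdict without coordination.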
